GitHub user pferrel commented on a diff in the pull request:
https://github.com/apache/incubator-predictionio/pull/269#discussion_r73793283
--- Diff:
examples/scala-parallel-similarproduct/filterbyyear/src/main/scala/DataSource.scala
---
@@ -13,16 +16,21 @@ import org.apache.spark.rdd.RDD
import grizzled.slf4j.Logger
-case class DataSourceParams(appId: Int) extends Params
+case class DataSourceParams(appName: String, eventWindow: Option[EventWindow], appId: Int) extends Params
class DataSource(val dsp: DataSourceParams)
extends PDataSource[TrainingData,
- EmptyEvaluationInfo, Query, EmptyActualResult] {
+ EmptyEvaluationInfo, Query, EmptyActualResult] with SelfCleaningDataSource {
+
+ @transient override lazy val logger = Logger[this.type]
- @transient lazy val logger = Logger[this.type]
+ override def appName = dsp.appName
+ override def eventWindow = dsp.eventWindow
override
def readTraining(sc: SparkContext): TrainingData = {
+ val events = cleanPersistedPEvents(sc)
+
val eventsDb = Storage.getPEvents()
// create a RDD of (entityID, User)
--- End diff --
Ready for review.
This adds the appName and eventWindow members needed by the
SelfCleaningDataSource trait, both as an example and for tests.
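For readers unfamiliar with the pattern in the diff, here is a minimal, self-contained sketch of how a data source can satisfy a trait's abstract members by delegating to its params. It does not use PredictionIO: `SelfCleaningLike`, `describeCleaning`, and the single-field `EventWindow` are hypothetical stand-ins chosen for illustration, and the real trait and case class may differ.

```scala
object SelfCleaningSketch {
  // Simplified stand-in for EventWindow; the single field is an assumption.
  final case class EventWindow(duration: Option[String] = None)

  // Hypothetical stand-in for the trait: it declares the members that a
  // data source has to supply before the trait's own logic can run.
  trait SelfCleaningLike {
    def appName: String
    def eventWindow: Option[EventWindow]

    // Hypothetical helper that only shows the trait reading those members.
    def describeCleaning: String = eventWindow match {
      case Some(w) =>
        s"clean '$appName' with window ${w.duration.getOrElse("unbounded")}"
      case None =>
        s"no cleaning configured for '$appName'"
    }
  }

  // Mirrors the shape of the params in the diff, without extending Params.
  final case class DataSourceParams(
    appName: String,
    eventWindow: Option[EventWindow],
    appId: Int
  )

  // The data source forwards the trait's members to its params,
  // as the overrides in the diff do.
  class DataSource(val dsp: DataSourceParams) extends SelfCleaningLike {
    override def appName: String = dsp.appName
    override def eventWindow: Option[EventWindow] = dsp.eventWindow
  }

  def main(args: Array[String]): Unit = {
    val params = DataSourceParams("MyApp", Some(EventWindow(Some("3 days"))), 1)
    println(new DataSource(params).describeCleaning)
  }
}
```

Keeping the values in the params case class means the window can be set or omitted per engine configuration, while the trait only depends on the two overridden members.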