[
https://issues.apache.org/jira/browse/PIO-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875107#comment-15875107
]
Pat Ferrel commented on PIO-45:
-------------------------------
The latest SelfCleaningDatasource still has some issues. I created a template
that only cleans the EventServer and created an integration test to illustrate.
1) The following $set events, in pseudocode:
{code:java}
Nexus, $set, "categories": ["Tablets"], today
Nexus, $set, "categories": ["Tablets", "Electronics"], today - 2 days
Nexus, $set, "categories": ["Tablets", "Electronics", "Google"], today - 6 days
{code}
should aggregate today into
{code:java}
Nexus,$set, "categories": ["Tablets"]
{code}
But the actual value comes out:
{code:java}
Nexus,$set, "categories": ["Tablets", "Electronics", "Google"]
{code}
The aggregate should be the most recent $set/$unset for each named
property.
For a given object, every property touched over all time is aggregated, but not
every value each property has taken on. Each value comes only from the most
recent $set of that property.
So unique properties accumulate until the object is $deleted, but the most
recent $set wins in aggregation.
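A minimal sketch of the rule described above, in plain Java rather than actual
PredictionIO code. The class PropertyAggregationSketch, the PropertyEvent type
and the aggregate method are hypothetical names made up for illustration, and
the handling of $unset (drop the named properties) and $delete (drop the whole
object) is an assumption about the intended semantics:

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in PredictionIO.
final class PropertyAggregationSketch {

    // Minimal stand-in for a $set/$unset/$delete event on a single object.
    static final class PropertyEvent {
        final String event;               // "$set", "$unset" or "$delete"
        final Map<String, ?> properties;  // property name -> value
        final Instant eventTime;

        PropertyEvent(String event, Map<String, ?> properties, Instant eventTime) {
            this.event = event;
            this.properties = properties;
            this.eventTime = eventTime;
        }
    }

    // Replays the events of ONE object in chronological order, so the most
    // recent $set of each property overwrites any older value.
    static Map<String, Object> aggregate(List<PropertyEvent> events) {
        List<PropertyEvent> ordered = new ArrayList<>(events);
        ordered.sort(Comparator.comparing((PropertyEvent e) -> e.eventTime));

        Map<String, Object> current = new LinkedHashMap<>();
        for (PropertyEvent e : ordered) {
            switch (e.event) {
                case "$set":
                    current.putAll(e.properties);   // newer value replaces older
                    break;
                case "$unset":
                    current.keySet().removeAll(e.properties.keySet());
                    break;
                case "$delete":
                    current.clear();                // assumed: object is gone
                    break;
                default:
                    break;                          // other events are ignored
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Instant today = Instant.now();
        // The three $sets from the example, deliberately not in time order.
        List<PropertyEvent> events = List.of(
            new PropertyEvent("$set",
                Map.of("categories", List.of("Tablets")), today),
            new PropertyEvent("$set",
                Map.of("categories", List.of("Tablets", "Electronics")),
                today.minus(2, ChronoUnit.DAYS)),
            new PropertyEvent("$set",
                Map.of("categories", List.of("Tablets", "Electronics", "Google")),
                today.minus(6, ChronoUnit.DAYS)));

        System.out.println(aggregate(events)); // prints {categories=[Tablets]}
    }
}
```

Sorting by event time before replaying is what makes the result independent of
the order in which events were stored. A cleaner that replays them in any other
order could end up keeping the oldest value instead of the newest.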
This seems super important to get right since this is the only method to trim
and compact the EventStore and without it working correctly, events accumulate
forever.
> SelfCleaningDatasource erases all data
> --------------------------------------
>
> Key: PIO-45
> URL: https://issues.apache.org/jira/browse/PIO-45
> Project: PredictionIO
> Issue Type: Bug
> Affects Versions: 0.10.0-incubating
> Reporter: Pat Ferrel
> Assignee: Alexander Merritt
> Priority: Blocker
> Fix For: 0.11.0
>
> Attachments: import_handmade_simple.py,
> sample-time-window-and-downsample-data.txt
>
>
> As integrated into the UR, in the integration test, the SelfCleaningDatasource
> erases all data. This feature works fine in the AML version of PIO.
> Although not tested, one could assume that this would be true with any other
> Datasource in other templates.
> [~emergentorder] can you check to see if the PIO merge was done correctly?
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)