provided by Filterd
The Event Filter object defines a stack of simple event log filters that can be applied on an event log to preprocess the event log. See Examples.
Import/Export
You can import/export an Event Filter from/to an .xpmef file.
Usage
You can create and edit Event Filter objects by
- the Filter Event Log plugin
- the Event Filter on Event Log Visualizer
Simple Event Log Filters
The Event Filter object is a stack of filters of three different basic types:
- Select traces… filters
- Project traces on events… filters
- Aggregation and feature engineering filters
Select Traces Filters
Select Traces Filters remove entire traces from the log based on a predicate over the trace attributes and the attributes of the events in the trace.
Select traces by start event
Include only those traces that begin with a specific event type
- Configuration:
- Classifier or event attributes by which to classify events
- List of start event types according to chosen classifier, select those events that are allowed start events
- Use when: filtering out incomplete traces or when focusing on traces starting in a particular fashion
Select traces by end event
Include only those traces that finish with a specific event type
- Configuration: see select traces by start event
- Use when: filtering out incomplete traces or when focusing on traces that end in a particular fashion (case outcomes)
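The start and end event filters can be sketched as one predicate on the first or last event of a trace. This is a minimal illustration, not the plugin's implementation: the log model (a list of traces, each a list of event dictionaries), the function `select_by_boundary_event`, and the classifier `by_name` are assumptions made for this example. Only the key `concept:name` comes from the XES standard.

```python
# Hypothetical log model: a log is a list of traces, a trace is a list of event dicts.
def select_by_boundary_event(log, classifier, allowed, position="start"):
    """Keep traces whose first ("start") or last ("end") event class is in `allowed`."""
    index = 0 if position == "start" else -1
    return [t for t in log if t and classifier(t[index]) in allowed]


def by_name(event):
    # Assumed classifier: the XES activity name attribute.
    return event["concept:name"]


log = [
    [{"concept:name": "Register"}, {"concept:name": "Pay"}],
    [{"concept:name": "Check"}, {"concept:name": "Pay"}],       # starts mid-process
    [{"concept:name": "Register"}, {"concept:name": "Check"}],  # not finished yet
]

# Stack both filters to keep only traces that start and end as expected.
started = select_by_boundary_event(log, by_name, {"Register"}, "start")
complete = select_by_boundary_event(started, by_name, {"Pay"}, "end")
```

Applying the two filters one after the other mirrors how they would be stacked to remove incomplete traces.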
Select traces by frequency
Include only those trace variants that occur at least/at most N times or P% in the log
- Configuration:
- Classifier by which to classify events
- Choice whether to filter by absolute or relative frequency
- Slider with minimum and maximum allowed frequency of trace variant
- Use when: focusing on globally frequent behavior in the process or when analyzing specifically the outliers in the process
- Note: the Event Filter calculates trace variants when the configuration is loaded and whenever the event classifier is changed. This may take some time.
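A trace variant is the sequence of event classes of a trace, so the frequency filter first has to count variants over the whole log, which is why recomputation can take time. The sketch below illustrates the idea under assumptions: the log model (list of traces, each a list of event dictionaries) and the names `variant_of` and `select_by_variant_frequency` are hypothetical, and relative frequency is assumed to be the share of traces in the log.

```python
from collections import Counter


def variant_of(trace, classifier):
    """A variant is the tuple of event classes in trace order."""
    return tuple(classifier(event) for event in trace)


def select_by_variant_frequency(log, classifier, minimum, maximum, relative=False):
    """Keep traces whose variant frequency lies in [minimum, maximum]."""
    counts = Counter(variant_of(t, classifier) for t in log)
    # Assumption: relative frequency = variant count / number of traces.
    scale = len(log) if relative else 1
    return [t for t in log
            if minimum <= counts[variant_of(t, classifier)] / scale <= maximum]


def ev(name):
    return {"concept:name": name}


def by_name(event):
    return event["concept:name"]


log = [[ev("A"), ev("B")], [ev("A"), ev("B")], [ev("A"), ev("C")]]

frequent = select_by_variant_frequency(log, by_name, 2, 3)                 # absolute
rare = select_by_variant_frequency(log, by_name, 0.0, 0.5, relative=True)  # at most 50%
```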
Select traces by trace attribute
Include or exclude those traces with a specific trace attribute value
- Configuration:
- Trace attribute to filter on
- Values of trace attribute, select which values are allowed/disallowed
- Whether to keep or remove traces with chosen values
- Use when: focusing on particular types of cases (customer types, patients, case outcomes if recorded in trace attributes)
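As a rough sketch of this filter, assume each trace is a dictionary with an `attributes` mapping and an `events` list. That layout, the function name `select_by_trace_attribute`, and the treatment of traces that lack the attribute (they count as non-matching) are assumptions for illustration only.

```python
def select_by_trace_attribute(log, attribute, values, keep=True):
    """Keep (or remove) traces whose trace attribute has one of `values`."""
    # Assumption: a trace without the attribute never matches.
    return [t for t in log
            if (t["attributes"].get(attribute) in values) == keep]


log = [
    {"attributes": {"customer": "gold"}, "events": []},
    {"attributes": {"customer": "silver"}, "events": []},
    {"attributes": {}, "events": []},
]

gold_only = select_by_trace_attribute(log, "customer", {"gold"})
without_gold = select_by_trace_attribute(log, "customer", {"gold"}, keep=False)
```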
Select traces by event attribute
Include or exclude those traces that have an event with a specific event attribute
- Configuration:
- The event attribute to filter on
- Values of chosen event attribute, select which values are allowed/disallowed
- Whether to keep or remove traces containing an event with the chosen values
- Option to keep or remove traces where no event has the attribute specified
- Use when: focusing on particular types of behavior or cases that are not recorded in trace attributes, such as dynamic values or updates that occur only at event level (e.g., payment amounts in a payment event, a particular actor/resource involved in a trace)
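The configuration options above can be read as the following sketch. The log model (list of traces, each a list of event dictionaries), the function `select_by_event_attribute`, and its parameter names are hypothetical; in particular, the separate option for traces in which no event carries the attribute is modeled here as `keep_if_missing`.

```python
def select_by_event_attribute(log, attribute, values, keep=True, keep_if_missing=False):
    """Keep (or remove) traces that contain an event with one of `values`."""
    selected = []
    for trace in log:
        present = [e[attribute] for e in trace if attribute in e]
        if not present:
            # No event has the attribute: decided by the separate option.
            if keep_if_missing:
                selected.append(trace)
            continue
        matches = any(v in values for v in present)
        if matches == keep:
            selected.append(trace)
    return selected


log = [
    [{"concept:name": "Pay", "amount": 500}, {"concept:name": "Close"}],
    [{"concept:name": "Pay", "amount": 20}],
    [{"concept:name": "Close"}],  # no event has an "amount"
]

large = select_by_event_attribute(log, "amount", {500})
not_large = select_by_event_attribute(log, "amount", {500}, keep=False)
large_or_unknown = select_by_event_attribute(log, "amount", {500}, keep_if_missing=True)
```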
Select traces by directly/eventually follows of events
Include or exclude those traces where two events with specific event attributes occur (directly) after each other, for example activity A directly follows activity B
- Configuration
- Event attribute/classifier
- Two lists of values of the chosen event attribute, select the values of the first event (left) and of the second event (right) that may (not) follow each other
- Drop-down list whether events that match the chosen values must (not) follow each other directly/eventually
- Additional options to limit
- The time window during which the events must (not) follow each other
- Event attributes of related events (e.g. worked on by same/different resources)
- Use when: focusing on behavior that follows specific paths through the process, e.g. specific decision outcomes, long-distance dependencies, repetitions. Often useful to remove specific deviating but frequent behavior where specific activities occur out of order compared to the main behavior
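The difference between "directly follows" and "eventually follows" can be illustrated as below. This is a simplified sketch: the log model (list of traces, each a list of event dictionaries) and the functions `follows` and `select_by_follows` are assumptions, and the additional time-window and related-attribute options are left out.

```python
def follows(trace, attribute, first_values, second_values, directly=True):
    """True if an event from `first_values` is followed by one from `second_values`."""
    for i, event in enumerate(trace):
        if event.get(attribute) not in first_values:
            continue
        # Directly: only the next event counts. Eventually: any later event counts.
        later = trace[i + 1:i + 2] if directly else trace[i + 1:]
        if any(e.get(attribute) in second_values for e in later):
            return True
    return False


def select_by_follows(log, attribute, first_values, second_values,
                      directly=True, keep=True):
    return [t for t in log
            if follows(t, attribute, first_values, second_values, directly) == keep]


def ev(name):
    return {"concept:name": name}


log = [
    [ev("A"), ev("B"), ev("C")],
    [ev("A"), ev("C"), ev("B")],
    [ev("B"), ev("A")],
]

direct = select_by_follows(log, "concept:name", {"A"}, {"B"}, directly=True)
eventual = select_by_follows(log, "concept:name", {"A"}, {"B"}, directly=False)
never_direct = select_by_follows(log, "concept:name", {"A"}, {"B"}, keep=False)
```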
Select traces by performance
Include or exclude traces taking more/less than a given amount of time
- Configuration:
- minimum and maximum total time from first to last event in the trace
- Whether traces shall be included or excluded if they fall within the interval
- Use when: analyzing and separating long running cases from fast running cases
- Hints:
- To filter on performance between two specific activities in the middle of a case, use Select traces by directly/eventually follows of events with the additional option to limit the duration between the events
- When filtering for absolute time values, use Select traces by timeframe
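A minimal sketch of the performance filter, assuming traces are lists of event dictionaries ordered by time and timestamps are stored as `datetime` values under the XES key `time:timestamp`. The function name `select_by_performance` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta

TS = "time:timestamp"


def select_by_performance(log, minimum, maximum, include=True):
    """Include (or exclude) traces whose total duration lies in [minimum, maximum]."""
    result = []
    for trace in log:
        if not trace:
            continue  # assumption: empty traces have no duration and are dropped
        total = trace[-1][TS] - trace[0][TS]  # first to last event
        if (minimum <= total <= maximum) == include:
            result.append(trace)
    return result


log = [
    [{TS: datetime(1982, 1, 4, 9, 0)}, {TS: datetime(1982, 1, 4, 10, 0)}],  # 1 hour
    [{TS: datetime(1982, 1, 4, 9, 0)}, {TS: datetime(1982, 1, 7, 9, 0)}],   # 3 days
]

fast = select_by_performance(log, timedelta(0), timedelta(days=1))
slow = select_by_performance(log, timedelta(0), timedelta(days=1), include=False)
```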
Select traces by timeframe
Include or exclude traces if their timestamps fall within a specific time period (e.g. January 1982)
- Configuration:
- Start and end time of time window based on time values available in the event log
- Whether traces shall be in/excluded if they start/overlap/end in the chosen time period
- Use when: analyzing specific seasons or when separating the event log due to concept drift (seasonal/time-based changes in the process)
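The start/overlap/end choice can be illustrated as follows. The sketch assumes traces are time-ordered lists of event dictionaries with `datetime` values under `time:timestamp`. The function `select_by_timeframe` and the mode names are hypothetical.

```python
from datetime import datetime

TS = "time:timestamp"


def select_by_timeframe(log, window_start, window_end, mode="overlap", include=True):
    """Include (or exclude) traces that start, end, or overlap in the window."""
    result = []
    for trace in log:
        if not trace:
            continue  # assumption: empty traces are dropped
        first, last = trace[0][TS], trace[-1][TS]
        if mode == "start":
            hit = window_start <= first <= window_end
        elif mode == "end":
            hit = window_start <= last <= window_end
        else:  # "overlap": the trace interval intersects the window
            hit = first <= window_end and last >= window_start
        if hit == include:
            result.append(trace)
    return result


january = (datetime(1982, 1, 1), datetime(1982, 1, 31, 23, 59, 59))
log = [
    [{TS: datetime(1981, 12, 20)}, {TS: datetime(1982, 1, 5)}],  # ends in January
    [{TS: datetime(1982, 1, 10)}, {TS: datetime(1982, 2, 2)}],   # starts in January
    [{TS: datetime(1982, 3, 1)}, {TS: datetime(1982, 3, 2)}],    # outside January
]

starting = select_by_timeframe(log, *january, mode="start")
ending = select_by_timeframe(log, *january, mode="end")
overlapping = select_by_timeframe(log, *january, mode="overlap")
outside = select_by_timeframe(log, *january, mode="overlap", include=False)
```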
Select trace sample
Retain a random sample of traces
- Configuration
- Percentage of traces to retain
- Use when:
- Prototyping an analysis on a very large log where the running time of the analysis (discovery, conformance checking, complex filtering) is too long for rapid prototyping
- Hint
- When building a complex filter stack, make the random sample your first filter. This reduces the size of the event log and makes exploration faster as you build your analysis. When you are done, remove the first filter: your filter stack then runs on the full dataset in exactly the same way.
- If you realize only later that your filter stack takes too long, you can introduce a random sample filter at any time and move it to the top of your filter stack. This reduces the log size and gives you shorter feedback loops
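The hint about putting the sample first can be sketched with a filter stack modeled as a plain list of functions. Everything here is an assumption for illustration: the names `sample_traces`, `apply_stack`, and `only_b0`, the fixed seed, and the choice to sample an exact number of traces (the plugin may sample differently).

```python
import random


def sample_traces(log, percentage, seed=None):
    """Retain a random `percentage` of traces, preserving their order."""
    rng = random.Random(seed)
    size = round(len(log) * percentage / 100)
    chosen = set(rng.sample(range(len(log)), size))
    return [t for i, t in enumerate(log) if i in chosen]


def apply_stack(log, stack):
    """Apply each filter of the stack in order, top to bottom."""
    for filter_step in stack:
        log = filter_step(log)
    return log


def only_b0(traces):
    # Stand-in for an expensive analysis filter.
    return [t for t in traces if t[-1]["concept:name"] == "B0"]


log = [[{"concept:name": "A"}, {"concept:name": f"B{i % 3}"}] for i in range(100)]

# While prototyping: sample first, then the rest of the stack.
prototype_stack = [lambda traces: sample_traces(traces, 10, seed=42), only_b0]
preview = apply_stack(log, prototype_stack)

# When done: drop the first filter and run the same stack on the full log.
final_stack = prototype_stack[1:]
full = apply_stack(log, final_stack)
```

Because the sample is the first element of the stack, removing it leaves every other filter and its configuration unchanged.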
Project Traces Filters
The “project traces on events…” filters keep all traces, but remove individual events from those traces – in other words, they project the sequence of events onto a subset of allowed events. In comparison, the Select Traces Filters keep all events of the traces they retain, only entire traces are removed/kept.
Project traces on events by event attributes
Includes or excludes those events that have a specific attribute/value combination
- Configuration
- Whether to include or exclude events with matching attribute/value pairs
- The event attribute based on which the events to include/exclude are determined
- List of values of the chosen event attribute, mark all values for which events shall be included/excluded
- Whether events without this attribute shall be kept or removed
- Whether traces from which all events were removed through projection (i.e. empty traces) shall be kept or removed
- Use when:
- Focusing on a specific part of the process
- Focusing on a specific dynamic (e.g. specific resources, specific updates)
- Removing irregular/chaotic events
- Removing events with specific life-cycle steps
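In contrast to the select filters, a projection rewrites each trace. A minimal sketch under assumptions: the log is a list of traces, each a list of event dictionaries, and the function `project_on_event_attribute` with its parameters `keep_missing` and `keep_empty` is a hypothetical rendering of the options listed above.

```python
def project_on_event_attribute(log, attribute, values, include=True,
                               keep_missing=False, keep_empty=False):
    """Keep only events that pass the attribute/value test in every trace."""
    projected = []
    for trace in log:
        kept = []
        for event in trace:
            if attribute not in event:
                if keep_missing:
                    kept.append(event)
            elif (event[attribute] in values) == include:
                kept.append(event)
        # Traces that lost all their events are kept only on request.
        if kept or keep_empty:
            projected.append(kept)
    return projected


LC = "lifecycle:transition"
log = [
    [{"concept:name": "A", LC: "start"}, {"concept:name": "A", LC: "complete"},
     {"concept:name": "B"}],
    [{"concept:name": "A", LC: "start"}],
]

# Remove all events with life-cycle step "start".
no_start = project_on_event_attribute(log, LC, {"start"}, include=False,
                                      keep_missing=True)
with_empty = project_on_event_attribute(log, LC, {"start"}, include=False,
                                        keep_missing=True, keep_empty=True)
```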
Project traces on events by event frequency
Include/exclude all events of a particular type depending on how frequently the event type occurs in the log
- Configuration
- Event classifier by which to determine frequency of event types
- Minimum and maximum absolute/relative frequency of events
- Whether selected event types shall be included or excluded
- Use when:
- Focusing on the frequent activities in the data and removing infrequent activities that show rare paths and exceptional behavior
- Hint: when focusing on exceptional and rare behavior, do not use the frequency filter but use Select traces by event attribute, as this ensures that the frequent events in the rare cases are also kept
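The event frequency projection can be sketched by counting event classes over the whole log and then filtering events. The function `project_on_event_frequency` is hypothetical, and two details are assumptions: relative frequency is taken as the share of all events in the log, and traces that become empty are kept.

```python
from collections import Counter


def project_on_event_frequency(log, classifier, minimum, maximum,
                               relative=False, include=True):
    """Keep (or remove) events whose class frequency lies in [minimum, maximum]."""
    counts = Counter(classifier(e) for t in log for e in t)
    # Assumption: relative frequency = class count / total number of events.
    scale = sum(counts.values()) if relative else 1
    selected = {c for c, n in counts.items() if minimum <= n / scale <= maximum}
    return [[e for e in t if (classifier(e) in selected) == include] for t in log]


def ev(name):
    return {"concept:name": name}


def by_name(event):
    return event["concept:name"]


log = [[ev("A"), ev("B"), ev("C")], [ev("A"), ev("B")], [ev("A"), ev("X")]]

frequent_only = project_on_event_frequency(log, by_name, 2, 3)
infrequent_only = project_on_event_frequency(log, by_name, 2, 3, include=False)
```

Note how `infrequent_only` loses the frequent events A and B from the rare traces, which is the effect the hint above warns about.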
Project traces by trimming on start/end
Includes only a subsequence of events between determined start and end points and excludes all preceding and succeeding behavior
- Configuration
- The event attribute and values for the start point and for the end point of the subsequence (the start and end events are both included)
- In case a trace contains multiple matching events for start/end whether the first or the last match should be chosen as start/end point
- Use when:
- Focusing on a specific continuous part in the middle of a process, allows reducing complexity by cutting out the behavior before and after
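Trimming can be pictured as slicing each trace between a chosen start match and a chosen end match. In this sketch the function `trim_trace` and its parameters are hypothetical. How traces without a start or end match are handled is an assumption (they become empty here).

```python
def trim_trace(trace, attribute, start_values, end_values,
               start_match="first", end_match="last"):
    """Return the subsequence from the start point to the end point, inclusive."""
    starts = [i for i, e in enumerate(trace) if e.get(attribute) in start_values]
    ends = [i for i, e in enumerate(trace) if e.get(attribute) in end_values]
    if not starts or not ends:
        return []  # assumption: no match on either side yields an empty trace
    start = starts[0] if start_match == "first" else starts[-1]
    end = ends[0] if end_match == "first" else ends[-1]
    return trace[start:end + 1]  # empty if the end point precedes the start point


def ev(name):
    return {"concept:name": name}


trace = [ev("A"), ev("B"), ev("C"), ev("B"), ev("D")]

from_first_b = trim_trace(trace, "concept:name", {"B"}, {"D"}, start_match="first")
from_last_b = trim_trace(trace, "concept:name", {"B"}, {"D"}, start_match="last")
```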
Aggregation Filter
Aggregation filters merge several events into a new event and remove the events that were aggregated, but keep all traces and all other events. The filter configuration determines how the attributes of the new event are derived from the attributes of the aggregated events.
Merge subsequent events
Aggregates sequences of events of the same type into a new event
- Configuration
- Attribute to classify which events are considered equal for aggregation (e.g. activity name)
- Values for which events shall be aggregated
- Possibly additional attribute and values that must match for all events in the sequence
- Whether the aggregated event that is created shall
- Have timestamp and attribute values of first or last event of sequence
- Be two events with start and complete lifecycle
- Use when: the log contains repetitive sequences of low-level events that are not relevant for the analysis (e.g. sequences of sensor readings or data updates)
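Merging runs of equal events can be sketched with `itertools.groupby`, which groups consecutive items with the same key. The function `merge_subsequent` and the `keep` modes are hypothetical, the optional additional matching attribute is omitted, and the integer timestamps are placeholders chosen for readability.

```python
from itertools import groupby


def merge_subsequent(trace, attribute, values, keep="first"):
    """Collapse runs of consecutive events with the same `attribute` value."""
    merged = []
    for key, group in groupby(trace, key=lambda e: e.get(attribute)):
        run = list(group)
        if key not in values or len(run) == 1:
            merged.extend(run)  # not selected for aggregation, or nothing to merge
        elif keep == "both":
            # Represent the run as a start event and a complete event.
            merged.append({**run[0], "lifecycle:transition": "start"})
            merged.append({**run[-1], "lifecycle:transition": "complete"})
        else:
            merged.append(dict(run[0] if keep == "first" else run[-1]))
    return merged


def ev(name, ts):
    return {"concept:name": name, "time:timestamp": ts}


trace = [ev("Read", 1), ev("Read", 2), ev("Read", 3), ev("Decide", 4)]

first = merge_subsequent(trace, "concept:name", {"Read"}, keep="first")
last = merge_subsequent(trace, "concept:name", {"Read"}, keep="last")
both = merge_subsequent(trace, "concept:name", {"Read"}, keep="both")
```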
Feature Engineering Filters
Derive new attributes or other features in the event log from the existing traces and events. All existing traces and events remain, but may get additional attributes.
Compose categorical attribute
Construct a new event-level attribute by composing two or more existing attributes, concatenating their values. The composed attribute is added to the log as a global attribute (defined for all events) and as a new classifier, so you can use it in subsequent filters.
- Configuration: two lists of attributes, you can select multiple attributes in each list. The composed attribute consists of
- The attributes in the first list (in alphabetical order), followed by
- The attributes in the second list (in alphabetical order)
- Use when:
- You want to use a specific attribute as event classifier but it isn’t declared as an event classifier. In this case, select the attribute as single entry in the first list and no attribute in the second list
- You want to refine event classes based on further attributes, examples
- Activity name + life-cycle
- Document name + Activity name
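The composition rule (first list in alphabetical order, then second list in alphabetical order) can be sketched as follows. The function `compose_attribute`, the `+` separator, the name given to the new attribute, and the use of an empty string for missing values are all assumptions for illustration. The example attribute `doc` is made up.

```python
def compose_attribute(log, first, second=(), separator="+"):
    """Add a composed attribute to every event and return its (assumed) name."""
    names = sorted(first) + sorted(second)
    new_key = separator.join(names)
    for trace in log:
        for event in trace:
            # Missing attributes contribute an empty string (assumption).
            event[new_key] = separator.join(str(event.get(n, "")) for n in names)
    return new_key


log = [[{"concept:name": "Approve", "lifecycle:transition": "start", "doc": "Invoice"}]]

# Activity name + life-cycle
key1 = compose_attribute(log, ["concept:name"], ["lifecycle:transition"])
# Document name + Activity name
key2 = compose_attribute(log, ["doc"], ["concept:name"])
```

After the call, the returned key can serve as the event classifier in later filters of the same stack.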