Exercise 5

For this exercise we will use exercise5.xes, which is the result of a simulation.

  1. Start with a log inspection on exercise5.xes.
  2. Create some process model(s) using two (or more) of the plug-ins from the previous exercises.
  3. Run the dotted chart analysis plug-in on exercise5.xes and try to answer the following questions:
    1. What do you see using the default settings? What does each row, column and dot represent?
    2. What does the diagonal line from the top left to the middle of the last row represent?
    3. Using the default settings, what do you think is the meaning of the row of dots that is located below and to the left of the diagonal line?
    4. Can we see a clear division of tasks that are executed in the beginning and (repeated) near the end of the process?
    5. Change the settings to “Time option: Relative(Time)” and “Sort By: Actual duration”. What do we see now?
    6. Change the settings to “Component type: Originator”, “Time option: Actual” and “Sort By: None”. What do we see now?
    7. Using these settings, can we distinguish groups of users doing similar tasks?

Solutions

  1. Log inspection does not reveal anything special in this case.
  2. The result of the alpha-algorithm is shown in the top figure below and the result of the Fuzzy Miner in the bottom figure below. Other plug-ins should produce similar process models.
  3. Dotted chart analysis:
    1. When you run the Dotted Chart on exercise5.xes you get a result similar to the figure below, although your colors may differ. Each row (horizontal line) is one case, because the “Component Type” is set to “Instance ID”. The x-axis at the top shows dates, so time progresses from left to right. Each dot is an event as recorded in the event log. Each color refers to a particular activity, because the “Color By” option is set to “Task ID”.
    2. The diagonal line highlighted in the figure below indicates the start of new cases. Since cases with a higher identification number start later in time, the line of start events is not vertical but runs diagonally.
    3. In the figure above these sections of dots are circled. If you select the dots in ProM you will see that the identification numbers of these cases are between 2 and 9. The sorting algorithm compares the identifiers as text, starting with their first digit, so the order in the graph is 19-2-20-21. Can you spot case number 100?
    4. Yes. If we look again at the figure two figures back, we see that ‘red’ events are executed at the beginning of each case and do not appear later in that case.
    5. The resulting dotted chart is shown in the figure below. Time is made relative to the first event of each case, and cases are sorted with the shortest on top. This gives a good indication of how durations are distributed among cases. We can see, for instance, that some cases are processed within 30 or 60 days. The longest case, however, lasts 720 days after its first registered event.
    6. The result is shown in the figure below. Each row now represents an originator (user), so we see which user executes which activity at what time.
    7. It appears that there are 4 groups of users doing similar tasks. Anne and Mike do similar tasks, as does the group of Carol, John, Mary, Pam, Pete, Sam and Sara. Then there is Wil, who performs only one type of task and is the only one who performs it. Finally, there is the ‘_INVALID’ ‘user’, which ‘executes’ all kinds of activities.
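
The sort order described in solution 3.3 can be reproduced outside ProM. The sketch below is only an illustration: it assumes the case identifiers are the plain numbers 1 to 100 stored as text, which may not match how exercise5.xes actually names its cases.

```python
# Assumption: case identifiers are the numbers 1..100, stored as strings.
case_ids = [str(n) for n in range(1, 101)]

# Sorting strings compares character by character, so "19" < "2" < "20".
text_order = sorted(case_ids)

# Converting to integers for comparison restores the numeric order.
numeric_order = sorted(case_ids, key=int)

start = text_order.index("19")
print(text_order[start:start + 4])   # the neighbours of case 2 in text order
print(text_order[:4])                # where case 100 ends up in text order
print(numeric_order[:4])             # the order one would intuitively expect
```

Under these assumptions the single-digit cases land between much later cases, which is why their start events fall below and to the left of the diagonal.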
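
The settings used in solution 3.5 (relative time, sorted by actual duration) amount to a small computation per case. The following is a minimal sketch with made-up events; the case identifiers, activity names and timestamps are hypothetical and were chosen only to mirror the 30, 60 and 720 day durations mentioned in the text.

```python
from datetime import datetime, timedelta

base = datetime(2010, 1, 1)

# Hypothetical events: (case id, activity, timestamp). Not read from exercise5.xes.
events = [
    ("1", "start", base),
    ("1", "end", base + timedelta(days=60)),
    ("2", "start", base + timedelta(days=4)),
    ("2", "end", base + timedelta(days=34)),
    ("3", "start", base + timedelta(days=8)),
    ("3", "end", base + timedelta(days=728)),
]

# Find the first and last timestamp of every case.
first, last = {}, {}
for case, _activity, ts in events:
    first[case] = min(first.get(case, ts), ts)
    last[case] = max(last.get(case, ts), ts)

# "Relative (Time)": each event is positioned by its offset from the case start.
relative = [(case, act, (ts - first[case]).days) for case, act, ts in events]

# "Sort By: Actual duration": shortest case first.
duration_days = {case: (last[case] - first[case]).days for case in first}
row_order = sorted(duration_days, key=duration_days.get)

print(row_order)
print(duration_days)
```

Every case now starts at offset 0, so the right edge of each row shows its duration directly.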
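
The grouping in solution 3.7 is read off the chart by eye, but the same idea can be sketched in code by collecting the set of activities per originator. The user names below come from the solution; the activity names are invented for illustration, and grouping on identical activity sets is a simplifying assumption (the chart only shows that tasks are similar).

```python
from collections import defaultdict

# Hypothetical (originator, activity) pairs. Activity names are made up.
events = [
    ("Anne", "register"), ("Mike", "register"),
    ("Carol", "check"), ("Carol", "decide"),
    ("John", "decide"), ("John", "check"),
    ("Wil", "audit"),
]

# Collect the set of activities each user has executed.
tasks_by_user = defaultdict(set)
for user, activity in events:
    tasks_by_user[user].add(activity)

# Users with exactly the same activity set end up in the same group.
groups = defaultdict(list)
for user, tasks in tasks_by_user.items():
    groups[frozenset(tasks)].append(user)

for tasks, users in groups.items():
    print(sorted(users), "->", sorted(tasks))
```

A user such as ‘_INVALID’, who executes all kinds of activities, would form a group of its own under this scheme.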