Signavio challenge
2
.gitignore
vendored
@@ -1 +1,3 @@
.idea/
.gradle/
build/
48
signavio/README.md
Normal file
@@ -0,0 +1,48 @@
# Signavio Backend Coding Challenge
Thank you very much for taking an interest in working at Signavio! After your first interview with us, we'd like to get to know your approach to problem solving with this coding challenge.

This challenge is actually closely related to the requirements of one of our products: Process Intelligence. This product enables Signavio customers to analyse big data generated from their processes and lets them recognize the gaps and variants between the intended business process (to-be) and the operating technical processes (as-is). Find out more about Process Intelligence [here](https://www.signavio.com/products/process-intelligence/)!

## The Goal
Please solve the challenge below by meeting the given acceptance criteria. Push your solution to this private GitHub repository and please do not publish your solution in any public GitHub repository or anywhere else!

Should you have any questions regarding the challenge, please email: coding.challenge.backend@signavio.com

In general, feel free to make assumptions and have fun with the challenge!
Please make sure you focus your time on explaining and elaborating your thought process, assumptions, decisions, thoughts and comments, for example inside a `thought-process.md` file. Please also document how to build and run your code.

## Tech Stack
* Java or Kotlin
* Gradle or Maven
* No external databases; do all calculations in memory to avoid complexity
* Test code where you feel it makes sense

## Details
A Signavio customer, a multinational company that builds industrial appliances, has an internal system dedicated to procuring (buying) any and all resources the company requires to operate. Procurement is done via the company's own ERP (Enterprise Resource Planning) system.

A typical business process represented in the ERP system is "procure-to-pay", which generally involves the following activities:
* create purchase request
* request approved
* create purchase order
* select supplier
* receive goods
* pay invoice

Whenever the company wants to buy something, they do so through their ERP system.

The company buys many resources, always using their ERP system. Each resource purchase can be considered a case, or single instance of this process. As it happens, the actual as-is process often deviates from the ideal to-be process. Sometimes purchase requests are raised but never get approved, sometimes a supplier is selected but the goods are never received, sometimes it simply takes a long time to complete the process, and so on. We call these deviations from the process path "variants".

The customer provides us with extracted process data from their existing ERP system. The customer extracted one of their processes for analysis: Procure to Pay. The log files contain 3 columns:
* activity name
* case id
* timestamp

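As an illustration only, one row of such a log could be turned into a typed value roughly as follows. The sketch assumes semicolon-separated rows in the order `CaseID;ActivityName;Timestamp` and the timestamp pattern `yyyy-MM-dd HH:mm:ss.SSS`, which is how the test fixtures in this commit are laid out; `LogRow` and `LogRowParser` are hypothetical names and not part of the challenge.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical value type for one row of the activity log. */
final class LogRow {
  final String caseId;
  final String activityName;
  final LocalDateTime timestamp;

  LogRow(String caseId, String activityName, LocalDateTime timestamp) {
    this.caseId = caseId;
    this.activityName = activityName;
    this.timestamp = timestamp;
  }
}

final class LogRowParser {
  // Assumed timestamp layout, e.g. "2014-11-20 00:00:00.000".
  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

  /** Parses one data row; assumes no quoted fields and no embedded semicolons. */
  static LogRow parse(String line) {
    String[] parts = line.split(";", -1);
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected 3 columns but got " + parts.length);
    }
    return new LogRow(
        parts[0].trim(), parts[1].trim(), LocalDateTime.parse(parts[2].trim(), FORMAT));
  }

  public static void main(String[] args) {
    LogRow row = parse("100430031000112060012015;Post invoice in FI;2015-01-08 14:26:02.000");
    System.out.println(row.caseId + " | " + row.activityName + " | " + row.timestamp);
  }
}
```

A real log may contain quoted fields or a different column order, in which case a CSV library is the safer choice.
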
We want to analyse and compare process instances (cases) with each other. You can find the sample data set [here](samples/Activity_Log_2004_to_2014.csv).

## Acceptance Criteria
* Aggregate cases that have the same event execution order and list the 10 variants with the most cases.
* Provide your output as JSON; choose a structure that makes sense.
* As that output is used by other highly interactive components, we need to be able to get the query results in well under 50 milliseconds.

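One possible reading of the first criterion, sketched in plain Java: collect each case's activity names in order, count how many cases share the same sequence, then keep the most frequent sequences. `VariantSketch` and its input shape (`{caseId, activityName}` pairs that are already in chronological order within each case) are assumptions made for illustration; JSON serialization is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class VariantSketch {
  /**
   * Each event is {caseId, activityName}; events of one case are assumed to be in
   * chronological order. Returns up to `limit` (sequence, caseCount) entries, most frequent first.
   */
  static List<Map.Entry<List<String>, Integer>> topVariants(List<String[]> events, int limit) {
    // 1. Collect the ordered activity names of every case.
    Map<String, List<String>> byCase = new LinkedHashMap<>();
    for (String[] event : events) {
      byCase.computeIfAbsent(event[0], k -> new ArrayList<>()).add(event[1]);
    }
    // 2. Count how many cases share each sequence (lists compare by content).
    Map<List<String>, Integer> counts = new LinkedHashMap<>();
    for (List<String> sequence : byCase.values()) {
      counts.merge(sequence, 1, Integer::sum);
    }
    // 3. Sort by count, descending, and cut to the requested size.
    List<Map.Entry<List<String>, Integer>> sorted = new ArrayList<>(counts.entrySet());
    sorted.sort(Map.Entry.<List<String>, Integer>comparingByValue(Comparator.reverseOrder()));
    return new ArrayList<>(sorted.subList(0, Math.min(Math.max(limit, 0), sorted.size())));
  }

  public static void main(String[] args) {
    List<String[]> events = new ArrayList<>();
    events.add(new String[] {"c1", "create purchase request"});
    events.add(new String[] {"c1", "pay invoice"});
    events.add(new String[] {"c2", "create purchase request"});
    events.add(new String[] {"c2", "pay invoice"});
    events.add(new String[] {"c3", "create purchase request"});
    for (Map.Entry<List<String>, Integer> variant : topVariants(events, 10)) {
      System.out.println(variant.getValue() + " case(s): " + variant.getKey());
    }
  }
}
```

If the rows of a case are not guaranteed to be chronological, each case's events would need to be sorted by timestamp before step 2.
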
## Example

51
signavio/build.gradle
Normal file
@@ -0,0 +1,51 @@
plugins {
    id "application"
    id 'java'
    id 'com.diffplug.gradle.spotless' version '3.25.0' // code formatter
}

apply plugin: "java"
apply plugin: "com.diffplug.gradle.spotless"

group = 'com.marqusm.signavio'
version = '0.0.2'
sourceCompatibility = '11'

ext {
    javaMainClass = "com.marqusm.signavio.processintelligence.Main"
}

application {
    mainClassName = javaMainClass
}

configurations {
    compileOnly {
        extendsFrom annotationProcessor
    }
}

spotless {
    java {
        googleJavaFormat()
    }
}

repositories {
    mavenCentral()
}

dependencies {
    implementation 'org.apache.commons:commons-csv:1.7'
    implementation 'com.google.code.gson:gson:2.8.6'

    implementation 'org.slf4j:slf4j-api:1.7.28'
    implementation 'ch.qos.logback:logback-classic:1.2.3'
    compileOnly 'org.projectlombok:lombok:1.18.10'
    annotationProcessor 'org.projectlombok:lombok:1.18.10'
    implementation 'org.apache.commons:commons-lang3:3.9'
    implementation 'com.google.guava:guava:28.0-jre'

    testImplementation 'junit:junit:4.12'
    testImplementation 'org.assertj:assertj-core:3.13.2'
}
BIN
signavio/gradle/wrapper/gradle-wrapper.jar
vendored
Normal file
Binary file not shown.
6
signavio/gradle/wrapper/gradle-wrapper.properties
vendored
Normal file
@@ -0,0 +1,6 @@
#Fri Apr 19 10:17:24 CEST 2019
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-5.6.3-all.zip
172
signavio/gradlew
vendored
Normal file
@@ -0,0 +1,172 @@
#!/usr/bin/env sh

##############################################################################
##
## Gradle start up script for UN*X
##
##############################################################################

# Attempt to set APP_HOME
# Resolve links: $0 may be a link
PRG="$0"
# Need this for relative symlinks.
while [ -h "$PRG" ] ; do
    ls=`ls -ld "$PRG"`
    link=`expr "$ls" : '.*-> \(.*\)$'`
    if expr "$link" : '/.*' > /dev/null; then
        PRG="$link"
    else
        PRG=`dirname "$PRG"`"/$link"
    fi
done
SAVED="`pwd`"
cd "`dirname \"$PRG\"`/" >/dev/null
APP_HOME="`pwd -P`"
cd "$SAVED" >/dev/null

APP_NAME="Gradle"
APP_BASE_NAME=`basename "$0"`

# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
DEFAULT_JVM_OPTS=""

# Use the maximum available, or set MAX_FD != -1 to use that value.
MAX_FD="maximum"

warn () {
    echo "$*"
}

die () {
    echo
    echo "$*"
    echo
    exit 1
}

# OS specific support (must be 'true' or 'false').
cygwin=false
msys=false
darwin=false
nonstop=false
case "`uname`" in
  CYGWIN* )
    cygwin=true
    ;;
  Darwin* )
    darwin=true
    ;;
  MINGW* )
    msys=true
    ;;
  NONSTOP* )
    nonstop=true
    ;;
esac

CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar

# Determine the Java command to use to start the JVM.
if [ -n "$JAVA_HOME" ] ; then
    if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
        # IBM's JDK on AIX uses strange locations for the executables
        JAVACMD="$JAVA_HOME/jre/sh/java"
    else
        JAVACMD="$JAVA_HOME/bin/java"
    fi
    if [ ! -x "$JAVACMD" ] ; then
        die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
    fi
else
    JAVACMD="java"
    which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
fi

# Increase the maximum file descriptors if we can.
if [ "$cygwin" = "false" -a "$darwin" = "false" -a "$nonstop" = "false" ] ; then
    MAX_FD_LIMIT=`ulimit -H -n`
    if [ $? -eq 0 ] ; then
        if [ "$MAX_FD" = "maximum" -o "$MAX_FD" = "max" ] ; then
            MAX_FD="$MAX_FD_LIMIT"
        fi
        ulimit -n $MAX_FD
        if [ $? -ne 0 ] ; then
            warn "Could not set maximum file descriptor limit: $MAX_FD"
        fi
    else
        warn "Could not query maximum file descriptor limit: $MAX_FD_LIMIT"
    fi
fi

# For Darwin, add options to specify how the application appears in the dock
if $darwin; then
    GRADLE_OPTS="$GRADLE_OPTS \"-Xdock:name=$APP_NAME\" \"-Xdock:icon=$APP_HOME/media/gradle.icns\""
fi

# For Cygwin, switch paths to Windows format before running java
if $cygwin ; then
    APP_HOME=`cygpath --path --mixed "$APP_HOME"`
    CLASSPATH=`cygpath --path --mixed "$CLASSPATH"`
    JAVACMD=`cygpath --unix "$JAVACMD"`

    # We build the pattern for arguments to be converted via cygpath
    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`
    SEP=""
    for dir in $ROOTDIRSRAW ; do
        ROOTDIRS="$ROOTDIRS$SEP$dir"
        SEP="|"
    done
    OURCYGPATTERN="(^($ROOTDIRS))"
    # Add a user-defined pattern to the cygpath arguments
    if [ "$GRADLE_CYGPATTERN" != "" ] ; then
        OURCYGPATTERN="$OURCYGPATTERN|($GRADLE_CYGPATTERN)"
    fi
    # Now convert the arguments - kludge to limit ourselves to /bin/sh
    i=0
    for arg in "$@" ; do
        CHECK=`echo "$arg"|egrep -c "$OURCYGPATTERN" -`
        CHECK2=`echo "$arg"|egrep -c "^-"`                                 ### Determine if an option

        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition
            eval `echo args$i`=`cygpath --path --ignore --mixed "$arg"`
        else
            eval `echo args$i`="\"$arg\""
        fi
        i=$((i+1))
    done
    case $i in
        (0) set -- ;;
        (1) set -- "$args0" ;;
        (2) set -- "$args0" "$args1" ;;
        (3) set -- "$args0" "$args1" "$args2" ;;
        (4) set -- "$args0" "$args1" "$args2" "$args3" ;;
        (5) set -- "$args0" "$args1" "$args2" "$args3" "$args4" ;;
        (6) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" ;;
        (7) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" ;;
        (8) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" ;;
        (9) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" "$args8" ;;
    esac
fi

# Escape application args
save () {
    for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done
    echo " "
}
APP_ARGS=$(save "$@")

# Collect all arguments for the java command, following the shell quoting and substitution rules
eval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS "\"-Dorg.gradle.appname=$APP_BASE_NAME\"" -classpath "\"$CLASSPATH\"" org.gradle.wrapper.GradleWrapperMain "$APP_ARGS"

# by default we should be in the correct project dir, but when run from Finder on Mac, the cwd is wrong
if [ "$(uname)" = "Darwin" ] && [ "$HOME" = "$PWD" ]; then
    cd "$(dirname "$0")"
fi

exec "$JAVACMD" "$@"
84
signavio/gradlew.bat
vendored
Normal file
@@ -0,0 +1,84 @@
@if "%DEBUG%" == "" @echo off
@rem ##########################################################################
@rem
@rem Gradle startup script for Windows
@rem
@rem ##########################################################################

@rem Set local scope for the variables with windows NT shell
if "%OS%"=="Windows_NT" setlocal

set DIRNAME=%~dp0
if "%DIRNAME%" == "" set DIRNAME=.
set APP_BASE_NAME=%~n0
set APP_HOME=%DIRNAME%

@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
set DEFAULT_JVM_OPTS=

@rem Find java.exe
if defined JAVA_HOME goto findJavaFromJavaHome

set JAVA_EXE=java.exe
%JAVA_EXE% -version >NUL 2>&1
if "%ERRORLEVEL%" == "0" goto init

echo.
echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.

goto fail

:findJavaFromJavaHome
set JAVA_HOME=%JAVA_HOME:"=%
set JAVA_EXE=%JAVA_HOME%/bin/java.exe

if exist "%JAVA_EXE%" goto init

echo.
echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.

goto fail

:init
@rem Get command-line arguments, handling Windows variants

if not "%OS%" == "Windows_NT" goto win9xME_args

:win9xME_args
@rem Slurp the command line arguments.
set CMD_LINE_ARGS=
set _SKIP=2

:win9xME_args_slurp
if "x%~1" == "x" goto execute

set CMD_LINE_ARGS=%*

:execute
@rem Setup the command line

set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar

@rem Execute Gradle
"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %CMD_LINE_ARGS%

:end
@rem End local scope for the variables with windows NT shell
if "%ERRORLEVEL%"=="0" goto mainEnd

:fail
rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of
rem the _cmd.exe /c_ return code!
if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1
exit /b 1

:mainEnd
if "%OS%"=="Windows_NT" endlocal

:omega
BIN
signavio/images/example.png
Normal file
Binary file not shown.
Size: 216 KiB
65500
signavio/samples/Activity_Log_2004_to_2014.csv
Normal file
File diff suppressed because it is too large
1
signavio/settings.gradle
Normal file
@@ -0,0 +1 @@
rootProject.name = 'process-intelligence'
@@ -0,0 +1,27 @@
package com.marqusm.signavio.processintelligence;

import com.marqusm.signavio.processintelligence.constant.AppConstant;
import lombok.extern.slf4j.Slf4j;
import lombok.val;
import org.apache.commons.lang3.time.StopWatch;

/**
 * @author : Marko
 * @createdOn : 29-Oct-19
 */
@Slf4j
class Main {
  public static void main(String[] args) {
    val processIntelligenceService =
        new ProcessIntelligence(Main.class.getResourceAsStream(AppConstant.ACTIVITY_RESOURCE_PATH));

    val stopWatch = new StopWatch();
    stopWatch.start();

    val topActivitiesJson = processIntelligenceService.getTopActivityVariantsJson();
    log.info(topActivitiesJson);

    stopWatch.stop();
    log.info("Time: {}ms", stopWatch.getTime());
  }
}
@@ -0,0 +1,91 @@
package com.marqusm.signavio.processintelligence;

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.marqusm.signavio.processintelligence.constant.AppConstant;
import com.marqusm.signavio.processintelligence.model.db.Activity;
import com.marqusm.signavio.processintelligence.model.dto.TopActivityVariants;
import com.marqusm.signavio.processintelligence.model.dto.Variant;
import com.marqusm.signavio.processintelligence.parser.base.ActivityCsvParser;
import java.io.InputStream;
import java.util.*;
import java.util.stream.Collectors;
import lombok.val;

/**
 * @author : Marko
 * @createdOn : 28-Oct-19
 */
class ProcessIntelligence {

  private final Gson gson = new GsonBuilder().setPrettyPrinting().create();

  private final List<Activity> activities;
  private final String topActivityVariantsJson;

  public ProcessIntelligence(InputStream inputStream) {
    activities = readActivities(inputStream);
    // Computed once up front; this also loads the classes that later calls rely on.
    topActivityVariantsJson = getTopActivityVariantsJson();
  }

  public String getTopActivityVariantsJson() {
    return gson.toJson(getTopActivityVariants());
  }

  String getCachedTopActivityVariants() {
    return topActivityVariantsJson;
  }

  public TopActivityVariants getTopActivityVariants() {
    val activityMap = groupActivitiesByCase(activities);
    val caseCount = activityMap.size();
    val variantMap = calculateVariants(activityMap);
    val variantCount = variantMap.size();
    val topVariants = calculateTopVariants(variantMap, AppConstant.TOP_VARIANTS_COUNT);
    return TopActivityVariants.of(caseCount, variantCount, topVariants);
  }

  private List<Activity> readActivities(InputStream inputStream) {
    try (val csvParser = ActivityCsvParser.of(inputStream)) {
      val activities = new LinkedList<Activity>();
      val iterator = csvParser.getIterator();
      iterator.forEachRemaining(activities::add);
      return activities;
    }
  }

  // Groups activities per case in file order; rows of a case are not re-sorted by timestamp.
  private Map<String, List<Activity>> groupActivitiesByCase(List<Activity> activities) {
    val map = new HashMap<String, List<Activity>>();
    activities.forEach(
        activity -> {
          val caseActivities = map.computeIfAbsent(activity.getCaseId(), k -> new LinkedList<>());
          caseActivities.add(activity);
        });
    return map;
  }

  // Counts how many cases share each ordered sequence of activity names.
  private Map<List<String>, Integer> calculateVariants(Map<String, List<Activity>> activityMap) {
    final Map<List<String>, Integer> variantMap = new HashMap<>();
    activityMap.forEach(
        (key, value) -> {
          val variant = value.stream().map(Activity::getName).collect(Collectors.toList());
          variantMap.merge(variant, 1, Integer::sum);
        });
    return variantMap;
  }

  @SuppressWarnings("SameParameterValue")
  private List<Variant> calculateTopVariants(
      Map<List<String>, Integer> variantMap, int topVariantsCount) {
    // Sort by case count, descending, and keep the first topVariantsCount entries.
    // Variants with equal counts come out in no particular order.
    return variantMap.entrySet().stream()
        .sorted(Map.Entry.<List<String>, Integer>comparingByValue(Comparator.reverseOrder()))
        .limit(topVariantsCount)
        .map(entry -> Variant.of(entry.getKey(), entry.getValue()))
        .collect(Collectors.toList());
  }
}
@@ -0,0 +1,10 @@
package com.marqusm.signavio.processintelligence.constant;

/**
 * @author : Marko
 * @createdOn : 29-Oct-19
 */
public class AppConstant {
  public static final String ACTIVITY_RESOURCE_PATH = "/samples/Activity_Log_2004_to_2014.csv";
  public static final int TOP_VARIANTS_COUNT = 10;
}
@@ -0,0 +1,16 @@
package com.marqusm.signavio.processintelligence.model.db;

import java.time.LocalDateTime;
import lombok.*;

/**
 * @author : Marko
 * @createdOn : 28-Oct-19
 */
@AllArgsConstructor(staticName = "of")
@Value
public class Activity {
  private String caseId;
  private String name;
  private LocalDateTime localDateTime;
}
@@ -0,0 +1,17 @@
package com.marqusm.signavio.processintelligence.model.dto;

import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Value;

/**
 * @author : Marko
 * @createdOn : 28-Oct-19
 */
@AllArgsConstructor(staticName = "of")
@Value
public class TopActivityVariants {
  private int totalCaseCount;
  private int totalVariantsCount;
  private List<Variant> topVariants;
}
@@ -0,0 +1,16 @@
package com.marqusm.signavio.processintelligence.model.dto;

import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Value;

/**
 * @author : Marko
 * @createdOn : 29-Oct-19
 */
@AllArgsConstructor(staticName = "of")
@Value
public class Variant {
  private List<String> activities;
  private int caseCount;
}
@@ -0,0 +1,75 @@
package com.marqusm.signavio.processintelligence.parser.base;

import com.marqusm.signavio.processintelligence.model.db.Activity;
import com.marqusm.signavio.processintelligence.util.CloseableUtil;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Iterator;
import java.util.Optional;
import lombok.RequiredArgsConstructor;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

/**
 * @author : Marko
 * @createdOn : 28-Oct-19
 */
@RequiredArgsConstructor(staticName = "of")
public class ActivityCsvParser implements AutoCloseable {
  private static final String DATE_FORMAT = "yyyy-MM-dd HH:mm:ss.SSS";
  private static final DateTimeFormatter DATE_FORMATTER = DateTimeFormatter.ofPattern(DATE_FORMAT);

  private final InputStream inputStream;

  public Iterator<Activity> getIterator() {
    final Iterator<CSVRecord> csvIterator;
    try {
      CSVParser csvParser =
          new CSVParser(
              new InputStreamReader(inputStream, StandardCharsets.UTF_8),
              CSVFormat.DEFAULT
                  .withDelimiter(';')
                  .withFirstRecordAsHeader()
                  .withIgnoreHeaderCase()
                  .withTrim());
      csvIterator = csvParser.iterator();

      return new Iterator<>() {

        @Override
        public boolean hasNext() {
          return csvIterator.hasNext();
        }

        @Override
        public Activity next() {
          var csvRecord = csvIterator.next();
          return Optional.ofNullable(csvRecord).map(r -> parseCSVRecord(r)).orElse(null);
        }
      };
    } catch (IOException e) {
      throw new IllegalStateException("Activities fetching failed.", e);
    }
  }

  private Activity parseCSVRecord(CSVRecord csvRecord) {
    try {
      return Activity.of(
          csvRecord.get(0), // case ID
          csvRecord.get(1), // activity name
          LocalDateTime.parse(csvRecord.get(2), DATE_FORMATTER));
    } catch (Exception e) {
      throw new IllegalArgumentException("Illegal CSV file", e);
    }
  }

  @Override
  public void close() {
    CloseableUtil.silentClose(inputStream);
  }
}
@@ -0,0 +1,27 @@
package com.marqusm.signavio.processintelligence.util;

import java.io.Closeable;
import java.io.IOException;
import lombok.extern.slf4j.Slf4j;

/**
 * @author : Marko
 * @createdOn : 28-Oct-19
 */
@Slf4j
public class CloseableUtil {
  @SuppressWarnings("WeakerAccess")
  public static void safeClose(Closeable closeable) throws IOException {
    if (closeable != null) {
      closeable.close();
    }
  }

  public static void silentClose(Closeable closeable) {
    try {
      safeClose(closeable);
    } catch (IOException e) {
      log.warn("Closeable resource threw an exception while closing.", e);
    }
  }
}
65500
signavio/src/main/resources/samples/Activity_Log_2004_to_2014.csv
Normal file
File diff suppressed because it is too large
@@ -0,0 +1,46 @@
package com.marqusm.signavio.processintelligence;

import com.marqusm.signavio.processintelligence.constant.AppConstant;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import org.apache.commons.lang3.time.StopWatch;
import org.assertj.core.api.Assertions;
import org.junit.Test;

/**
 * @author : Marko
 * @createdOn : 29-Oct-19
 */
public class ProcessIntelligencePTest {

  private static final int REPETITION_COUNT = 100;

  @Test
  public void getTopActivityVariants() {
    ProcessIntelligence processIntelligence =
        new ProcessIntelligence(
            this.getClass().getResourceAsStream(AppConstant.ACTIVITY_RESOURCE_PATH));
    List<Long> processingTime = new LinkedList<>();

    for (int i = 0; i < REPETITION_COUNT; i++) {
      StopWatch stopWatch = new StopWatch();
      stopWatch.start();
      processIntelligence.getTopActivityVariants();
      stopWatch.stop();
      processingTime.add(stopWatch.getTime());
    }

    double averageTime =
        processingTime.stream()
            .mapToDouble(a -> a)
            .average()
            .orElseThrow(IllegalStateException::new);
    long maxTime = Collections.max(processingTime);
    long minTime = Collections.min(processingTime);
    System.out.println("Average time: " + averageTime);
    System.out.println("Max time: " + maxTime);
    System.out.println("Min time: " + minTime);
    Assertions.assertThat(averageTime).isLessThan(50.);
  }
}
@@ -0,0 +1,53 @@
package com.marqusm.signavio.processintelligence;

import com.marqusm.signavio.processintelligence.model.dto.TopActivityVariants;
import org.assertj.core.api.Assertions;
import org.junit.Test;

/**
 * @author : Marko
 * @createdOn : 29-Oct-19
 */
public class ProcessIntelligenceTest {

  @Test
  public void testCachedResponse() {
    ProcessIntelligence processIntelligence =
        new ProcessIntelligence(
            this.getClass().getResourceAsStream("/samples/TestActivitiesCasesCount5.csv"));
    String result = processIntelligence.getCachedTopActivityVariants();
    Assertions.assertThat(result).isNotNull();
  }

  @Test
  public void test5Activities() {
    ProcessIntelligence processIntelligence =
        new ProcessIntelligence(
            this.getClass().getResourceAsStream("/samples/TestActivitiesTotalCount5.csv"));
    TopActivityVariants result = processIntelligence.getTopActivityVariants();
    Assertions.assertThat(result).isNotNull();
    Assertions.assertThat(result.getTotalCaseCount()).isEqualTo(2);
    Assertions.assertThat(result.getTotalVariantsCount()).isEqualTo(2);
    Assertions.assertThat(result.getTopVariants().size()).isEqualTo(2);
  }

  @Test
  public void test5Cases() {
    ProcessIntelligence processIntelligence =
        new ProcessIntelligence(
            this.getClass().getResourceAsStream("/samples/TestActivitiesCasesCount5.csv"));
    TopActivityVariants result = processIntelligence.getTopActivityVariants();
    Assertions.assertThat(result).isNotNull();
    Assertions.assertThat(result.getTotalCaseCount()).isEqualTo(5);
    Assertions.assertThat(result.getTotalVariantsCount()).isEqualTo(1);
    Assertions.assertThat(result.getTopVariants().size()).isEqualTo(1);
  }

  @Test(expected = IllegalArgumentException.class)
  public void testIllegalFile() {
    ProcessIntelligence processIntelligence =
        new ProcessIntelligence(
            this.getClass().getResourceAsStream("/samples/TestActivitiesIllegalFile.csv"));
    processIntelligence.getTopActivityVariants();
  }
}
@@ -0,0 +1,16 @@
CaseID;ActivityName;Timestamp
100430031000112060012015;Create FI invoice by vendor;2014-11-20 00:00:00.000
100430031000112060012015;Post invoice in FI;2015-01-08 14:26:02.000
100430031000112060012015;Clear open item;2015-01-12 23:59:59.000
100430031000112070012015;Create FI invoice by vendor;2014-12-08 00:00:00.000
100430031000112070012015;Post invoice in FI;2015-01-08 14:28:25.000
100430031000112070012015;Clear open item;2015-01-12 23:59:59.000
100430031000112080012015;Create FI invoice by vendor;2014-12-08 00:00:00.000
100430031000112080012015;Post invoice in FI;2015-01-08 14:29:47.000
100430031000112080012015;Clear open item;2015-01-12 23:59:59.000
100430031000112100012015;Create FI invoice by vendor;2014-12-11 00:00:00.000
100430031000112100012015;Post invoice in FI;2015-01-09 07:05:29.000
100430031000112100012015;Clear open item;2015-01-27 23:59:59.000
100430031000112110012015;Create FI invoice by vendor;2014-12-11 00:00:00.000
100430031000112110012015;Post invoice in FI;2015-01-09 07:06:35.000
100430031000112110012015;Clear open item;2015-01-27 23:59:59.000
@@ -0,0 +1,6 @@
CaseID,ActivityName,Timestamp
100430031000112060012015,Create FI invoice by vendor,2014-11-20 00:00:00.000
100430031000112060012015,Post invoice in FI,2015-01-08 14:26:02.000
100430031000112060012015,Clear open item,2015-01-12 23:59:59.000
100430031000112070012015,Create FI invoice by vendor,2014-12-08 00:00:00.000
100430031000112070012015,Post invoice in FI,2015-01-08 14:28:25.000
@@ -0,0 +1,6 @@
CaseID;ActivityName;Timestamp
100430031000112060012015;Create FI invoice by vendor;2014-11-20 00:00:00.000
100430031000112060012015;Post invoice in FI;2015-01-08 14:26:02.000
100430031000112060012015;Clear open item;2015-01-12 23:59:59.000
100430031000112070012015;Create FI invoice by vendor;2014-12-08 00:00:00.000
100430031000112070012015;Post invoice in FI;2015-01-08 14:28:25.000
59
signavio/thought-process.md
Normal file
@@ -0,0 +1,59 @@
It looks like I need to parse the file, process the data while parsing, and prepare it for fast reads.

Let's start with reading the data first.

One consideration: from the task details, I am not sure whether I should create a service for this,
but since it says that other components need to read from this one, a web service looks like a valid option.
I will create a very small micro-service.

I will use the standard web service code structure (controller, service, model).
It is not necessary in this case, but if the project grows even a bit,
it would be helpful to have a clean design.

I will be using Lombok to avoid boilerplate code in the controller, service and model classes.

Introducing ActivityCsvParser to read the CSV file.
I enhanced it so that it not only reads the CSV, but also collects activities by Case ID.
I am doing this to avoid iterating over the list of parsed items multiple times.

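A rough sketch of that single-pass idea, reading rows and grouping activity names by case ID in the same loop. This is not the ActivityCsvParser from this repository: it uses only the JDK, splits on `;` without handling quoted fields, and `GroupWhileParsing` is a hypothetical name used for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupWhileParsing {
  /** Reads "CaseID;ActivityName;Timestamp" rows and groups activity names per case in one pass. */
  static Map<String, List<String>> readGrouped(Reader source) {
    Map<String, List<String>> byCase = new LinkedHashMap<>();
    try (BufferedReader reader = new BufferedReader(source)) {
      reader.readLine(); // skip the header row
      String line;
      while ((line = reader.readLine()) != null) {
        if (line.isBlank()) {
          continue;
        }
        String[] parts = line.split(";", -1);
        if (parts.length != 3) {
          throw new IllegalArgumentException("Expected 3 columns: " + line);
        }
        // Grouping happens here, so no second iteration over a parsed list is needed.
        byCase.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return byCase;
  }

  public static void main(String[] args) {
    String csv =
        "CaseID;ActivityName;Timestamp\n"
            + "1;Create purchase request;2014-11-20 00:00:00.000\n"
            + "2;Create purchase request;2014-12-08 00:00:00.000\n"
            + "1;Pay invoice;2015-01-08 14:26:02.000\n";
    System.out.println(readGrouped(new StringReader(csv)));
  }
}
```
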
I implemented it the intuitive way, with the following steps:
reading activities, grouping them, finding all variants and finding the top variants,
but it does not perform well (it takes around 150 ms to finish).

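The last of those steps, finding the top variants, does not strictly need a full sort. A hedged sketch of one alternative, a bounded min-heap that never holds more than `k` candidates, is shown here; `TopK` is a hypothetical helper, not what ProcessIntelligence does, and whether it is measurably faster on this data set is untested.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TopK {
  /** Returns the k entries with the largest counts, largest first. */
  static <K> List<Map.Entry<K, Integer>> largest(Map<K, Integer> counts, int k) {
    if (k <= 0) {
      return new ArrayList<>();
    }
    // Min-heap: the weakest of the current candidates sits at the head.
    PriorityQueue<Map.Entry<K, Integer>> heap =
        new PriorityQueue<>(Comparator.comparingInt((Map.Entry<K, Integer> e) -> e.getValue()));
    for (Map.Entry<K, Integer> entry : counts.entrySet()) {
      if (heap.size() < k) {
        heap.add(entry);
      } else if (entry.getValue() > heap.peek().getValue()) {
        heap.poll(); // drop the weakest candidate
        heap.add(entry);
      }
    }
    List<Map.Entry<K, Integer>> result = new ArrayList<>(heap);
    result.sort(Comparator.comparingInt((Map.Entry<K, Integer> e) -> e.getValue()).reversed());
    return result;
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new HashMap<>();
    counts.put("A>B>C", 120);
    counts.put("A>B", 45);
    counts.put("A", 7);
    System.out.println(largest(counts, 2));
  }
}
```
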
After a closer analysis of the internals, I noticed that ObjectMapper was taking more time than Gson
to serialize the object to a JSON string, so I replaced it.

Also, I noticed that the first run takes much more time than any following run.
That makes sense, since in the first run Java is initializing all the necessary classes.
Any following run is dramatically faster. The real-world scenario would be
this service running and other services calling it.
So I created the performance test "ProcessIntelligencePTest",
which calls the function 100 times and calculates the average time.

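A minimal sketch of that measuring approach using only the JDK: a few untimed warm-up calls first, then the average over the timed calls. `WarmTiming` and its parameters are hypothetical; the actual test in this commit uses JUnit and StopWatch and has no separate warm-up loop.

```java
import java.util.function.Supplier;

final class WarmTiming {
  /** Runs the task `warmups` times untimed, then returns the mean of `runs` timed calls in ms. */
  static double averageMillis(Supplier<?> task, int warmups, int runs) {
    if (runs <= 0) {
      throw new IllegalArgumentException("runs must be positive");
    }
    for (int i = 0; i < warmups; i++) {
      task.get(); // lets the JVM load classes and compile hot paths before measuring
    }
    long totalNanos = 0;
    for (int i = 0; i < runs; i++) {
      long start = System.nanoTime();
      task.get();
      totalNanos += System.nanoTime() - start;
    }
    return totalNanos / (double) runs / 1_000_000.0;
  }

  public static void main(String[] args) {
    double average = averageMillis(() -> Math.sqrt(12345.678), 1_000, 100);
    System.out.println("Average time: " + average + " ms");
  }
}
```

For anything beyond a rough check, a dedicated benchmarking harness is usually more trustworthy than hand-rolled timing.
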
While looking for the best solution, I had the idea to cache the response at the very beginning.
That idea makes the service work much better in the non-cached scenario too,
since Java has already initialized all the necessary classes while running the first, caching call.
I will leave it in, just to make performance better.

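The caching idea can be sketched as a value that is computed once at construction time and then only read. `CachedResult` is a hypothetical helper, and the sketch assumes the input data does not change after loading.

```java
import java.util.function.Supplier;

final class CachedResult<T> {
  private final T value;

  /** Runs the computation eagerly, once; this also warms up the code path it uses. */
  CachedResult(Supplier<T> computation) {
    this.value = computation.get();
  }

  /** Returns the precomputed value without recomputing it. */
  T get() {
    return value;
  }

  public static void main(String[] args) {
    CachedResult<String> json = new CachedResult<>(() -> "{\"topVariants\": []}");
    System.out.println(json.get());
    System.out.println(json.get()); // same instance, no recomputation
  }
}
```
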
Later on, I had to remove the feature of mapping cases while reading the CSV file,
since I read the CSV file only once and use it as a database.
I am doing this to keep the solution in line with the task description.

On second thought, I would say that the "component"
that interacts with this service might be another class,
so I am dropping the micro-service approach and implementing this solution
as a class ("ProcessIntelligence.java") to avoid over-engineering.
It is very easy to create a micro-service later if necessary.

I copied the provided CSV file into the standard Java resource folder,
where I believe it belongs, so if you want to change the input data,
please use the resource folder for that.

### How to run
You can run the solution once by executing the command:
`gradlew run` on Windows or `./gradlew run` on Linux

### Notes
If you change any code and want to build the project (`gradlew clean build`),
please run `gradlew spotlessApply` before building,
since I am using a plugin that prevents unformatted code from being pushed.