Signavio challenge

This commit is contained in:
2025-05-10 16:24:09 +02:00
parent daf94080c8
commit 3648a66307
25 changed files with 131835 additions and 0 deletions

2
.gitignore vendored

@@ -1 +1,3 @@
.idea/
.gradle/
build/

48
signavio/README.md Normal file

@@ -0,0 +1,48 @@
# Signavio Backend Coding Challenge
Thank you very much for taking an interest in working at Signavio! After your first interview with us, we'd like to get to know your approach to problem solving with this coding challenge.
This challenge is actually closely related to the requirements of one of our products: Process Intelligence. This product enables Signavio customers to analyse big data generated from their processes and lets them recognize the gaps and variants from the intended business process (to-be) and the operating technical processes (as-is). Find out more about Process Intelligence [here](https://www.signavio.com/products/process-intelligence/)!
## The Goal
Please solve the challenge below by meeting the given acceptance criteria. Push your solution to this private GitHub repository and please do not publish your solution in any public GitHub repository or anywhere else!
Should you have any questions regarding the challenge, please email: coding.challenge.backend@signavio.com
In general, feel free to make assumptions and have fun with the challenge!
Please make sure you focus your time on explaining and elaborating your thought process, assumptions, decisions, thoughts and comments, for example inside a `thought-process.md` file. Please also document how to build and run your code.
## Tech Stack
* Java or Kotlin
* Gradle or Maven
* No external databases, do all calculations in memory to avoid complexity
* Test code where you feel it makes sense
## Details
A Signavio customer, a multinational company that builds industrial appliances, has an internal system dedicated to procuring (buying) any and all resources the company requires to operate. Procurement is done via the company's own ERP (Enterprise Resource Planning) system.
A typical business process represented in the ERP system is "procure-to-pay", which generally involves the following activities:
* create purchase request
* request approved
* create purchase order
* select supplier
* receive goods
* pay invoice
Whenever the company wants to buy something, they do so through their ERP system.
The company buys many resources, always using their ERP system. Each resource purchase can be considered a case, or single instance of this process. As it happens, the actual as-is process often deviates from the ideal to-be process. Sometimes purchase requests are raised but never get approved, sometimes a supplier is selected but the goods are never received, sometimes it simply takes a long time to complete the process, and so on. We call these deviations from the process path variants.
The customer provides us with extracted process data from their existing ERP (Enterprise Resource Planning) system. The customer extracted one of their processes for analysis: Procure to Pay. The logfiles contain 3 columns:
* activity name
* case id
* timestamp
We want to analyse and compare process instances (cases) with each other. You can find the sample data set [here](samples/Activity_Log_2004_to_2014.csv).
## Acceptance Criteria
* Aggregate cases that have the same event execution order and list the 10 variants with the most cases.
* Provide your output as JSON, choose a structure that makes sense.
* As that output is used by other highly interactive components, we need to be able to get the query results in well under 50 milliseconds.
## Example
![Variants example](images/example.png)

51
signavio/build.gradle Normal file

@@ -0,0 +1,51 @@
plugins {
id "application"
id 'java'
id 'com.diffplug.gradle.spotless' version '3.25.0' // code formatter
}
apply plugin: "java"
apply plugin: "com.diffplug.gradle.spotless"
group = 'com.marqusm.signavio'
version = '0.0.2'
sourceCompatibility = '11'
ext {
javaMainClass = "com.marqusm.signavio.processintelligence.Main"
}
application {
mainClassName = javaMainClass
}
configurations {
compileOnly {
extendsFrom annotationProcessor
}
}
spotless {
java {
googleJavaFormat()
}
}
repositories {
mavenCentral()
}
dependencies {
implementation 'org.apache.commons:commons-csv:1.7'
implementation 'com.google.code.gson:gson:2.8.6'
implementation 'org.slf4j:slf4j-api:1.7.28'
implementation 'ch.qos.logback:logback-classic:1.2.3'
compileOnly 'org.projectlombok:lombok:1.18.10'
annotationProcessor 'org.projectlombok:lombok:1.18.10'
implementation 'org.apache.commons:commons-lang3:3.9'
implementation 'com.google.guava:guava:28.0-jre'
testImplementation 'junit:junit:4.12'
testImplementation 'org.assertj:assertj-core:3.13.2'
}

Binary file not shown.


@@ -0,0 +1,6 @@
#Fri Apr 19 10:17:24 CEST 2019
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-5.6.3-all.zip

172
signavio/gradlew vendored Normal file

@@ -0,0 +1,172 @@
#!/usr/bin/env sh
##############################################################################
##
## Gradle start up script for UN*X
##
##############################################################################
# Attempt to set APP_HOME
# Resolve links: $0 may be a link
PRG="$0"
# Need this for relative symlinks.
while [ -h "$PRG" ] ; do
ls=`ls -ld "$PRG"`
link=`expr "$ls" : '.*-> \(.*\)$'`
if expr "$link" : '/.*' > /dev/null; then
PRG="$link"
else
PRG=`dirname "$PRG"`"/$link"
fi
done
SAVED="`pwd`"
cd "`dirname \"$PRG\"`/" >/dev/null
APP_HOME="`pwd -P`"
cd "$SAVED" >/dev/null
APP_NAME="Gradle"
APP_BASE_NAME=`basename "$0"`
# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
DEFAULT_JVM_OPTS=""
# Use the maximum available, or set MAX_FD != -1 to use that value.
MAX_FD="maximum"
warn () {
echo "$*"
}
die () {
echo
echo "$*"
echo
exit 1
}
# OS specific support (must be 'true' or 'false').
cygwin=false
msys=false
darwin=false
nonstop=false
case "`uname`" in
CYGWIN* )
cygwin=true
;;
Darwin* )
darwin=true
;;
MINGW* )
msys=true
;;
NONSTOP* )
nonstop=true
;;
esac
CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar
# Determine the Java command to use to start the JVM.
if [ -n "$JAVA_HOME" ] ; then
if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
# IBM's JDK on AIX uses strange locations for the executables
JAVACMD="$JAVA_HOME/jre/sh/java"
else
JAVACMD="$JAVA_HOME/bin/java"
fi
if [ ! -x "$JAVACMD" ] ; then
die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME
Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
fi
else
JAVACMD="java"
which java >/dev/null 2>&1 || die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
fi
# Increase the maximum file descriptors if we can.
if [ "$cygwin" = "false" -a "$darwin" = "false" -a "$nonstop" = "false" ] ; then
MAX_FD_LIMIT=`ulimit -H -n`
if [ $? -eq 0 ] ; then
if [ "$MAX_FD" = "maximum" -o "$MAX_FD" = "max" ] ; then
MAX_FD="$MAX_FD_LIMIT"
fi
ulimit -n $MAX_FD
if [ $? -ne 0 ] ; then
warn "Could not set maximum file descriptor limit: $MAX_FD"
fi
else
warn "Could not query maximum file descriptor limit: $MAX_FD_LIMIT"
fi
fi
# For Darwin, add options to specify how the application appears in the dock
if $darwin; then
GRADLE_OPTS="$GRADLE_OPTS \"-Xdock:name=$APP_NAME\" \"-Xdock:icon=$APP_HOME/media/gradle.icns\""
fi
# For Cygwin, switch paths to Windows format before running java
if $cygwin ; then
APP_HOME=`cygpath --path --mixed "$APP_HOME"`
CLASSPATH=`cygpath --path --mixed "$CLASSPATH"`
JAVACMD=`cygpath --unix "$JAVACMD"`
# We build the pattern for arguments to be converted via cygpath
ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`
SEP=""
for dir in $ROOTDIRSRAW ; do
ROOTDIRS="$ROOTDIRS$SEP$dir"
SEP="|"
done
OURCYGPATTERN="(^($ROOTDIRS))"
# Add a user-defined pattern to the cygpath arguments
if [ "$GRADLE_CYGPATTERN" != "" ] ; then
OURCYGPATTERN="$OURCYGPATTERN|($GRADLE_CYGPATTERN)"
fi
# Now convert the arguments - kludge to limit ourselves to /bin/sh
i=0
for arg in "$@" ; do
CHECK=`echo "$arg"|egrep -c "$OURCYGPATTERN" -`
CHECK2=`echo "$arg"|egrep -c "^-"` ### Determine if an option
if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then ### Added a condition
eval `echo args$i`=`cygpath --path --ignore --mixed "$arg"`
else
eval `echo args$i`="\"$arg\""
fi
i=$((i+1))
done
case $i in
(0) set -- ;;
(1) set -- "$args0" ;;
(2) set -- "$args0" "$args1" ;;
(3) set -- "$args0" "$args1" "$args2" ;;
(4) set -- "$args0" "$args1" "$args2" "$args3" ;;
(5) set -- "$args0" "$args1" "$args2" "$args3" "$args4" ;;
(6) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" ;;
(7) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" ;;
(8) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" ;;
(9) set -- "$args0" "$args1" "$args2" "$args3" "$args4" "$args5" "$args6" "$args7" "$args8" ;;
esac
fi
# Escape application args
save () {
for i do printf %s\\n "$i" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/' \\\\/" ; done
echo " "
}
APP_ARGS=$(save "$@")
# Collect all arguments for the java command, following the shell quoting and substitution rules
eval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS "\"-Dorg.gradle.appname=$APP_BASE_NAME\"" -classpath "\"$CLASSPATH\"" org.gradle.wrapper.GradleWrapperMain "$APP_ARGS"
# by default we should be in the correct project dir, but when run from Finder on Mac, the cwd is wrong
if [ "$(uname)" = "Darwin" ] && [ "$HOME" = "$PWD" ]; then
cd "$(dirname "$0")"
fi
exec "$JAVACMD" "$@"

84
signavio/gradlew.bat vendored Normal file

@@ -0,0 +1,84 @@
@if "%DEBUG%" == "" @echo off
@rem ##########################################################################
@rem
@rem Gradle startup script for Windows
@rem
@rem ##########################################################################
@rem Set local scope for the variables with windows NT shell
if "%OS%"=="Windows_NT" setlocal
set DIRNAME=%~dp0
if "%DIRNAME%" == "" set DIRNAME=.
set APP_BASE_NAME=%~n0
set APP_HOME=%DIRNAME%
@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
set DEFAULT_JVM_OPTS=
@rem Find java.exe
if defined JAVA_HOME goto findJavaFromJavaHome
set JAVA_EXE=java.exe
%JAVA_EXE% -version >NUL 2>&1
if "%ERRORLEVEL%" == "0" goto init
echo.
echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.
goto fail
:findJavaFromJavaHome
set JAVA_HOME=%JAVA_HOME:"=%
set JAVA_EXE=%JAVA_HOME%/bin/java.exe
if exist "%JAVA_EXE%" goto init
echo.
echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%
echo.
echo Please set the JAVA_HOME variable in your environment to match the
echo location of your Java installation.
goto fail
:init
@rem Get command-line arguments, handling Windows variants
if not "%OS%" == "Windows_NT" goto win9xME_args
:win9xME_args
@rem Slurp the command line arguments.
set CMD_LINE_ARGS=
set _SKIP=2
:win9xME_args_slurp
if "x%~1" == "x" goto execute
set CMD_LINE_ARGS=%*
:execute
@rem Setup the command line
set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar
@rem Execute Gradle
"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %CMD_LINE_ARGS%
:end
@rem End local scope for the variables with windows NT shell
if "%ERRORLEVEL%"=="0" goto mainEnd
:fail
rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of
rem the _cmd.exe /c_ return code!
if not "" == "%GRADLE_EXIT_CONSOLE%" exit 1
exit /b 1
:mainEnd
if "%OS%"=="Windows_NT" endlocal
:omega

BIN
signavio/images/example.png Normal file

Binary file not shown.


File diff suppressed because it is too large

1
signavio/settings.gradle Normal file

@@ -0,0 +1 @@
rootProject.name = 'process-intelligence'


@@ -0,0 +1,27 @@
package com.marqusm.signavio.processintelligence;
import com.marqusm.signavio.processintelligence.constant.AppConstant;
import lombok.extern.slf4j.Slf4j;
import lombok.val;
import org.apache.commons.lang3.time.StopWatch;
/**
* @author : Marko
* @createdOn : 29-Oct-19
*/
@Slf4j
class Main {
public static void main(String[] args) {
val processIntelligenceService =
new ProcessIntelligence(Main.class.getResourceAsStream(AppConstant.ACTIVITY_RESOURCE_PATH));
val stopWatch = new StopWatch();
stopWatch.start();
val topActivitiesJson = processIntelligenceService.getTopActivityVariantsJson();
// Stop the timer before logging so the logging cost is not part of the measurement.
stopWatch.stop();
log.info(topActivitiesJson);
log.info(String.format("Time: %dms", stopWatch.getTime()));
}
}


@@ -0,0 +1,97 @@
package com.marqusm.signavio.processintelligence;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.marqusm.signavio.processintelligence.constant.AppConstant;
import com.marqusm.signavio.processintelligence.model.db.Activity;
import com.marqusm.signavio.processintelligence.model.dto.TopActivityVariants;
import com.marqusm.signavio.processintelligence.model.dto.Variant;
import com.marqusm.signavio.processintelligence.parser.base.ActivityCsvParser;
import java.io.InputStream;
import java.util.*;
import java.util.stream.Collectors;
import lombok.val;
/**
* @author : Marko
* @createdOn : 28-Oct-19
*/
class ProcessIntelligence {
private final Gson gson = new GsonBuilder().setPrettyPrinting().create();
private final List<Activity> activities;
private final String topActivityVariantsJson;
public ProcessIntelligence(InputStream inputStream) {
activities = readActivities(inputStream);
topActivityVariantsJson = getTopActivityVariantsJson();
}
public String getTopActivityVariantsJson() {
return gson.toJson(getTopActivityVariants());
}
String getCachedTopActivityVariants() {
return topActivityVariantsJson;
}
public TopActivityVariants getTopActivityVariants() {
val activityMap = groupActivitiesByCase(activities);
val caseCount = activityMap.size();
val variantMap = calculateVariants(activityMap);
val variantCount = variantMap.size();
val topVariants = calculateTopVariants(variantMap, AppConstant.TOP_VARIANTS_COUNT);
return TopActivityVariants.of(caseCount, variantCount, topVariants);
}
private List<Activity> readActivities(InputStream inputStream) {
try (val csvParser = ActivityCsvParser.of(inputStream)) {
val activities = new LinkedList<Activity>();
val iterator = csvParser.getIterator();
iterator.forEachRemaining(activities::add);
return activities;
}
}
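// Relies on the rows of each case already appearing in execution order in the input;
// activities are not sorted by timestamp here.
private Map<String, List<Activity>> groupActivitiesByCase(List<Activity> activities) {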
val map = new HashMap<String, List<Activity>>();
activities.forEach(
activity -> {
val caseActivities = map.computeIfAbsent(activity.getCaseId(), k -> new LinkedList<>());
caseActivities.add(activity);
});
return map;
}
private Map<List<String>, Integer> calculateVariants(Map<String, List<Activity>> activityMap) {
final Map<List<String>, Integer> variantMap = new HashMap<>();
activityMap.forEach(
(key, value) -> {
val variant = value.stream().map(Activity::getName).collect(Collectors.toList());
// merge() stores 1 for a new variant and otherwise increments the existing count.
variantMap.merge(variant, 1, Integer::sum);
});
return variantMap;
}
@SuppressWarnings("SameParameterValue")
private List<Variant> calculateTopVariants(
Map<List<String>, Integer> variantMap, int topVariantsCount) {
// Sort by case count, highest first, and keep the first N entries.
// The sort is stable, so variants with equal counts keep the map's iteration order.
// This avoids a TreeSet comparator that never reports two distinct elements as equal,
// which would break the Comparator contract.
return variantMap.entrySet().stream()
.sorted(Map.Entry.<List<String>, Integer>comparingByValue(Comparator.reverseOrder()))
.limit(topVariantsCount)
.map(entry -> Variant.of(entry.getKey(), entry.getValue()))
.collect(Collectors.toList());
}
}


@@ -0,0 +1,10 @@
package com.marqusm.signavio.processintelligence.constant;
/**
* @author : Marko
* @createdOn : 29-Oct-19
*/
public class AppConstant {
public static final String ACTIVITY_RESOURCE_PATH = "/samples/Activity_Log_2004_to_2014.csv";
public static final int TOP_VARIANTS_COUNT = 10;
}


@@ -0,0 +1,16 @@
package com.marqusm.signavio.processintelligence.model.db;
import java.time.LocalDateTime;
import lombok.*;
/**
* @author : Marko
* @createdOn : 28-Oct-19
*/
@AllArgsConstructor(staticName = "of")
@Value
public class Activity {
private String caseId;
private String name;
private LocalDateTime localDateTime;
}


@@ -0,0 +1,17 @@
package com.marqusm.signavio.processintelligence.model.dto;
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Value;
/**
* @author : Marko
* @createdOn : 28-Oct-19
*/
@AllArgsConstructor(staticName = "of")
@Value
public class TopActivityVariants {
private int totalCaseCount;
private int totalVariantsCount;
private List<Variant> topVariants;
}


@@ -0,0 +1,16 @@
package com.marqusm.signavio.processintelligence.model.dto;
import java.util.List;
import lombok.AllArgsConstructor;
import lombok.Value;
/**
* @author : Marko
* @createdOn : 29-Oct-19
*/
@AllArgsConstructor(staticName = "of")
@Value
public class Variant {
private List<String> activities;
private int caseCount;
}


@@ -0,0 +1,75 @@
package com.marqusm.signavio.processintelligence.parser.base;
import com.marqusm.signavio.processintelligence.model.db.Activity;
import com.marqusm.signavio.processintelligence.util.CloseableUtil;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Iterator;
import java.util.Optional;
import lombok.RequiredArgsConstructor;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
/**
* @author : Marko
* @createdOn : 28-Oct-19
*/
@RequiredArgsConstructor(staticName = "of")
public class ActivityCsvParser implements AutoCloseable {
private static final String DATE_FORMAT = "yyyy-MM-dd HH:mm:ss.SSS";
private static final DateTimeFormatter DATE_FORMATTER = DateTimeFormatter.ofPattern(DATE_FORMAT);
private final InputStream inputStream;
public Iterator<Activity> getIterator() {
final Iterator<CSVRecord> csvIterator;
try {
CSVParser csvParser =
new CSVParser(
new InputStreamReader(inputStream, StandardCharsets.UTF_8),
CSVFormat.DEFAULT
.withDelimiter(';')
.withFirstRecordAsHeader()
.withIgnoreHeaderCase()
.withTrim());
csvIterator = csvParser.iterator();
return new Iterator<>() {
@Override
public boolean hasNext() {
return csvIterator.hasNext();
}
@Override
public Activity next() {
var csvRecord = csvIterator.next();
return Optional.ofNullable(csvRecord).map(r -> parseCSVRecord(r)).orElse(null);
}
};
} catch (IOException e) {
throw new IllegalStateException("Activities fetching failed.", e);
}
}
private Activity parseCSVRecord(CSVRecord csvRecord) {
try {
return Activity.of(
csvRecord.get(0),
csvRecord.get(1),
LocalDateTime.parse(csvRecord.get(2), DATE_FORMATTER));
} catch (Exception e) {
throw new IllegalArgumentException("Illegal CSV file", e);
}
}
@Override
public void close() {
CloseableUtil.silentClose(inputStream);
}
}


@@ -0,0 +1,27 @@
package com.marqusm.signavio.processintelligence.util;
import java.io.Closeable;
import java.io.IOException;
import lombok.extern.slf4j.Slf4j;
/**
* @author : Marko
* @createdOn : 28-Oct-19
*/
@Slf4j
public class CloseableUtil {
@SuppressWarnings("WeakerAccess")
public static void safeClose(Closeable closeable) throws IOException {
if (closeable != null) {
closeable.close();
}
}
public static void silentClose(Closeable closeable) {
try {
safeClose(closeable);
} catch (IOException e) {
log.warn("Closeable resource threw an exception while closing.", e);
}
}
}

File diff suppressed because it is too large


@@ -0,0 +1,46 @@
package com.marqusm.signavio.processintelligence;
import com.marqusm.signavio.processintelligence.constant.AppConstant;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import org.apache.commons.lang3.time.StopWatch;
import org.assertj.core.api.Assertions;
import org.junit.Test;
/**
* @author : Marko
* @createdOn : 29-Oct-19
*/
public class ProcessIntelligencePTest {
private static final int REPETITION_COUNT = 100;
@Test
public void getTopActivityVariants() {
ProcessIntelligence processIntelligence =
new ProcessIntelligence(
this.getClass().getResourceAsStream(AppConstant.ACTIVITY_RESOURCE_PATH));
List<Long> processingTime = new LinkedList<>();
for (int i = 0; i < REPETITION_COUNT; i++) {
StopWatch stopWatch = new StopWatch();
stopWatch.start();
processIntelligence.getTopActivityVariants();
stopWatch.stop();
processingTime.add(stopWatch.getTime());
}
double averageTime =
processingTime.stream()
.mapToDouble(a -> a)
.average()
.orElseThrow(IllegalStateException::new);
long maxTime = Collections.max(processingTime);
long minTime = Collections.min(processingTime);
System.out.println("Average time: " + averageTime);
System.out.println("Max time: " + maxTime);
System.out.println("Min time: " + minTime);
Assertions.assertThat(averageTime).isLessThan(50.);
}
}


@@ -0,0 +1,53 @@
package com.marqusm.signavio.processintelligence;
import com.marqusm.signavio.processintelligence.model.dto.TopActivityVariants;
import org.assertj.core.api.Assertions;
import org.junit.Test;
/**
* @author : Marko
* @createdOn : 29-Oct-19
*/
public class ProcessIntelligenceTest {
@Test
public void testCachedResponse() {
ProcessIntelligence processIntelligence =
new ProcessIntelligence(
this.getClass().getResourceAsStream("/samples/TestActivitiesCasesCount5.csv"));
String result = processIntelligence.getCachedTopActivityVariants();
Assertions.assertThat(result).isNotNull();
}
@Test
public void test5Activities() {
ProcessIntelligence processIntelligence =
new ProcessIntelligence(
this.getClass().getResourceAsStream("/samples/TestActivitiesTotalCount5.csv"));
TopActivityVariants result = processIntelligence.getTopActivityVariants();
Assertions.assertThat(result).isNotNull();
Assertions.assertThat(result.getTotalCaseCount()).isEqualTo(2);
Assertions.assertThat(result.getTotalVariantsCount()).isEqualTo(2);
Assertions.assertThat(result.getTopVariants().size()).isEqualTo(2);
}
@Test
public void test5Cases() {
ProcessIntelligence processIntelligence =
new ProcessIntelligence(
this.getClass().getResourceAsStream("/samples/TestActivitiesCasesCount5.csv"));
TopActivityVariants result = processIntelligence.getTopActivityVariants();
Assertions.assertThat(result).isNotNull();
Assertions.assertThat(result.getTotalCaseCount()).isEqualTo(5);
Assertions.assertThat(result.getTotalVariantsCount()).isEqualTo(1);
Assertions.assertThat(result.getTopVariants().size()).isEqualTo(1);
}
@Test(expected = IllegalArgumentException.class)
public void testIllegalFile() {
ProcessIntelligence processIntelligence =
new ProcessIntelligence(
this.getClass().getResourceAsStream("/samples/TestActivitiesIllegalFile.csv"));
processIntelligence.getTopActivityVariants();
}
}


@@ -0,0 +1,16 @@
CaseID;ActivityName;Timestamp
100430031000112060012015;Create FI invoice by vendor;2014-11-20 00:00:00.000
100430031000112060012015;Post invoice in FI;2015-01-08 14:26:02.000
100430031000112060012015;Clear open item;2015-01-12 23:59:59.000
100430031000112070012015;Create FI invoice by vendor;2014-12-08 00:00:00.000
100430031000112070012015;Post invoice in FI;2015-01-08 14:28:25.000
100430031000112070012015;Clear open item;2015-01-12 23:59:59.000
100430031000112080012015;Create FI invoice by vendor;2014-12-08 00:00:00.000
100430031000112080012015;Post invoice in FI;2015-01-08 14:29:47.000
100430031000112080012015;Clear open item;2015-01-12 23:59:59.000
100430031000112100012015;Create FI invoice by vendor;2014-12-11 00:00:00.000
100430031000112100012015;Post invoice in FI;2015-01-09 07:05:29.000
100430031000112100012015;Clear open item;2015-01-27 23:59:59.000
100430031000112110012015;Create FI invoice by vendor;2014-12-11 00:00:00.000
100430031000112110012015;Post invoice in FI;2015-01-09 07:06:35.000
100430031000112110012015;Clear open item;2015-01-27 23:59:59.000


@@ -0,0 +1,6 @@
CaseID,ActivityName,Timestamp
100430031000112060012015,Create FI invoice by vendor,2014-11-20 00:00:00.000
100430031000112060012015,Post invoice in FI,2015-01-08 14:26:02.000
100430031000112060012015,Clear open item,2015-01-12 23:59:59.000
100430031000112070012015,Create FI invoice by vendor,2014-12-08 00:00:00.000
100430031000112070012015,Post invoice in FI,2015-01-08 14:28:25.000


@@ -0,0 +1,6 @@
CaseID;ActivityName;Timestamp
100430031000112060012015;Create FI invoice by vendor;2014-11-20 00:00:00.000
100430031000112060012015;Post invoice in FI;2015-01-08 14:26:02.000
100430031000112060012015;Clear open item;2015-01-12 23:59:59.000
100430031000112070012015;Create FI invoice by vendor;2014-12-08 00:00:00.000
100430031000112070012015;Post invoice in FI;2015-01-08 14:28:25.000


@@ -0,0 +1,59 @@
It looks like I need to parse the file, process the data while parsing, and prepare it for fast reading.
Let's start by reading the data.
One consideration: from the task details, I am not sure whether I should create a service for this,
but since it says that other components need to read from this one, a web service looks like a valid option.
I will create a very small micro-service.
I will use the standard web service code structure (controller, service, model).
It is not necessary in this case, but if the project grows even a bit,
it would be helpful to have a clean design.
I will be using Lombok to avoid boilerplate code in the controller, service and model classes.
I am introducing ActivityCsvParser to read the CSV file.
I enhanced it so that it not only reads the CSV but also collects activities by case ID.
I am doing this to avoid iterating multiple times over the list of parsed items.
I first implemented the intuitive approach with the following steps:
reading activities, grouping them by case, finding all variants, and finding the top variants,
but it does not perform well (it takes around 150ms to finish).
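The grouping, variant-counting and top-N steps can be sketched in plain Java as below. This is a minimal illustration and not the code from this repository: the class name `VariantSketch`, its method names, and the `{caseId, activityName}` row layout are assumptions made for the example, and it assumes the rows of each case already arrive in execution order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, introduced only for this illustration.
class VariantSketch {

  /** Groups rows of {caseId, activityName} by case, keeping row order within each case. */
  static Map<String, List<String>> groupByCase(List<String[]> rows) {
    Map<String, List<String>> byCase = new LinkedHashMap<>();
    for (String[] row : rows) {
      byCase.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
    }
    return byCase;
  }

  /** Counts how many cases share each activity sequence (variant). */
  static Map<List<String>, Integer> countVariants(Map<String, List<String>> byCase) {
    Map<List<String>, Integer> counts = new LinkedHashMap<>();
    for (List<String> sequence : byCase.values()) {
      // List equality is content-based, so identical sequences share one key.
      counts.merge(sequence, 1, Integer::sum);
    }
    return counts;
  }

  /** Returns the n most frequent variants, most frequent first. */
  static List<Map.Entry<List<String>, Integer>> topVariants(
      Map<List<String>, Integer> counts, int n) {
    return counts.entrySet().stream()
        .sorted(Map.Entry.<List<String>, Integer>comparingByValue(Comparator.reverseOrder()))
        .limit(n)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String[]> rows =
        List.of(
            new String[] {"case-1", "create purchase request"},
            new String[] {"case-1", "request approved"},
            new String[] {"case-2", "create purchase request"},
            new String[] {"case-2", "request approved"},
            new String[] {"case-3", "create purchase request"});
    for (Map.Entry<List<String>, Integer> entry :
        topVariants(countVariants(groupByCase(rows)), 10)) {
      System.out.println(entry.getValue() + " case(s): " + entry.getKey());
    }
  }
}
```

Each step is a single pass over its input, apart from the final sort over the distinct variants, which should stay cheap as long as there are far fewer variants than rows.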
After analysing the internals more closely, I noticed that ObjectMapper was taking more time than Gson
to translate the object to a JSON string, so I replaced it.
I also noticed that the first run takes much more time than any following run.
That makes sense, since in the first run Java is initializing all the necessary classes.
Any following run is dramatically faster. And in a real scenario
this service would already be running while other services call it.
So I created the performance test "ProcessIntelligencePTest",
in which I call the function 100 times and calculate the average time.
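One way to keep the slow first run out of such an average is to run a few unmeasured warm-up iterations first. The sketch below only illustrates that idea and is not the test in this repository; the class `TimingSketch`, the method `averageMillis` and its parameters are names assumed for the example.

```java
// Hypothetical helper class, introduced only for this illustration.
class TimingSketch {

  /**
   * Runs the task warmupRuns times without measuring, then returns the average
   * duration in milliseconds over measuredRuns measured executions.
   */
  static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
    if (measuredRuns <= 0) {
      throw new IllegalArgumentException("measuredRuns must be positive");
    }
    for (int i = 0; i < warmupRuns; i++) {
      task.run(); // gives the JVM a chance to load classes before measuring starts
    }
    long totalNanos = 0;
    for (int i = 0; i < measuredRuns; i++) {
      long start = System.nanoTime();
      task.run();
      totalNanos += System.nanoTime() - start;
    }
    return totalNanos / 1_000_000.0 / measuredRuns;
  }

  public static void main(String[] args) {
    double average = averageMillis(() -> Math.sqrt(12345.678), 10, 100);
    System.out.println("Average time: " + average + " ms");
  }
}
```

`System.nanoTime()` is used because it is meant for measuring elapsed time, and the nanosecond totals avoid rounding short runs down to 0 ms before averaging.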
While looking for the best solution, I had the idea to cache the response at the very beginning.
That idea makes the service perform much better in the non-cached scenario as well,
since Java initializes all the necessary classes while running the first (caching) call.
I will leave it in, just to make performance better.
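The caching idea can be sketched as a value that is computed once at construction time and then only read. This is a minimal illustration under assumed names (`EagerCache` is hypothetical), not the class used in this solution.

```java
import java.util.function.Supplier;

// Hypothetical helper class, introduced only for this illustration.
class EagerCache<T> {
  private final T value;

  /** Runs the computation exactly once, at construction time. */
  EagerCache(Supplier<T> computation) {
    this.value = computation.get();
  }

  /** Returns the stored result without recomputing it. */
  T get() {
    return value;
  }

  public static void main(String[] args) {
    EagerCache<String> cache = new EagerCache<>(() -> "{\"topVariants\": []}");
    System.out.println(cache.get());
    System.out.println(cache.get()); // served from the stored value
  }
}
```

Computing eagerly moves the one-off cost, including class loading triggered by the computation, to start-up, so later reads are plain field accesses. The trade-off is that the stored value goes stale if the underlying data changes.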
Later on, I had to remove the feature of mapping cases while reading the CSV file,
since I read the CSV file only once and use it as a database.
I am doing this to keep the solution in line with the task description.
On second thought, the "component"
that would interact with this service might be another class,
so I am dropping the micro-service approach and implementing this solution
as a class ("ProcessIntelligence.java") to avoid over-engineering.
It is very easy to create a micro-service later if necessary.
I copied the provided CSV file into the standard Java resource folder,
where I believe it belongs, so if you want to change the input data,
please use the resource folder for that.
### How to run
You can run the solution once by executing the command:
`gradlew run` on Windows or `./gradlew run` on Linux
### Notes
If you change any code and want to build the project (`gradlew clean build`),
please run `gradlew spotlessApply` before building,
since I am using a plugin that prevents unformatted code from being pushed.