Set up

On this page you will find all the elements required to reproduce the results of the paper "YARBUS: Yet another rule-based belief update system". For any question about the content of the repository or about the paper, please contact Jeremy Fix (@jeremyfix).

First, some general comments on how to get all the elements of the repository:


	    cd your_repo_root
	    git clone https://github.com/jeremyfix/dstc.git
	    cd dstc
	    mkdir build
	    cd build
        # Either build everything, including the viewer (which requires Qt):
        cmake .. -DDSTC_Viewer=ON && make
        # or skip building the viewer:
        cmake .. && make
        export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./src
        # You are now ready to play with the tools of the package
	

You can then play around with the tools of the package. All of them are, in one way or another, dedicated to processing data from the Dialog State Tracking Challenges 2 and 3 (at least).
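If you want to poke at the raw data yourself, a minimal Python sketch along the following lines can load a dialog; it assumes the standard DSTC2 layout (an .flist file listing dialog directories relative to a dataroot, each directory containing a log.json and a label.json) and the usual paths used in the commands below.

        # Minimal sketch, assuming the standard DSTC2 data layout (dataroot + .flist of
        # dialog directories, each holding log.json/label.json); adjust the paths to yours.
        import json
        import os

        dataroot = os.path.expanduser("~/DSTC/data")
        flist = os.path.expanduser("~/DSTC/scripts/config/dstc2_test.flist")

        with open(flist) as f:
            dialog_dirs = [line.strip() for line in f if line.strip()]

        # Peek at the first dialog: print the system transcript of every turn.
        with open(os.path.join(dataroot, dialog_dirs[0], "log.json")) as f:
            log = json.load(f)

        print(log["session-id"])
        for turn in log["turns"]:
            print("SYS:", turn["output"]["transcript"])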

Yarbus

Basic usage

Yarbus is a simple rule-based belief tracker that you can run on the dialogues of DSTC2 and DSTC3. In the repository, Yarbus is provided as several scripts.
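To give an intuition of what a rule-based belief update looks like, here is a toy sketch of an inform-style update on a single slot; it is only a schematic illustration with assumed notations, not the actual rule set of the paper (the real rules are defined in the paper and implemented in the C++ sources).

        # Toy illustration of an inform-style update on one slot: the SLU confidence of an
        # inform(slot=value) hypothesis reinforces that value in the belief. This is NOT
        # the exact YARBUS rule set, just the general flavour of a rule-based update.
        def inform_update(belief, informed_value, confidence):
            """belief: dict mapping the values of one slot to their probability."""
            updated = {v: (1.0 - confidence) * p for v, p in belief.items()}
            updated[informed_value] = updated.get(informed_value, 0.0) + confidence
            return updated

        belief = {"chinese": 0.2, "thai": 0.1, "none": 0.7}
        print(inform_update(belief, "chinese", 0.8))   # chinese gets most of the mass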

Basically, to run the tracker with a belief threshold of 0.01 on DSTC2:


        ./examples/yarbus ~/DSTC/scripts/config/dstc2_test.flist ~/DSTC/scripts/config/ontology_dstc2.json 0.01 0

If you want to see the output of the tracker on a specific dialog, say voip-28848c294, you simply call:

        ./examples/yarbus ~/DSTC/scripts/config/dstc2_test.flist ~/DSTC/scripts/config/ontology_dstc2.json 0.01 1 voip-28848c294

The other scripts take more or less the same parameters; just run them and they will print a help message.

Varying the rule set

In the paper, we varied the set of rules being used. This can be done with yarbus-rule-set. You need to specify a rule set, given as an integer that we convert into a binary code selecting the rules. There are five rules; let us write the rule set number as r0r1r2r3r4, where each ri in {0, 1} indicates whether the corresponding rule is activated (see the paper for the list of rules and their order). A small decoding sketch is given below.
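To fix ideas, here is a minimal sketch of how such an integer can be read as five binary flags; the mapping between bit positions and the actual rule names is an assumption here, so check the paper and yarbus-rule-set for the real convention.

        # Hypothetical decoding sketch: which bit corresponds to which rule is an
        # assumption, not the actual convention of yarbus-rule-set.
        def decode_rule_set(n):
            """Write n on five bits r0r1r2r3r4 and return the indices of the activated rules."""
            bits = format(n, "05b")                # e.g. 24 -> "11000"
            return [i for i, b in enumerate(bits) if b == "1"]

        print(decode_rule_set(24))                 # -> [0, 1]: exactly two rules activated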

As we saw in the experiments, most of the score is achieved by activating only the inform and expl-conf rules (i.e. rule set number 24); if, in addition, you set the belief threshold to 0.01, this gives:


        ./examples/yarbus-rule-set ~/DSTC/scripts/config/dstc2_test.flist ~/DSTC/scripts/config/ontology_dstc2.json 0.01 0 24

The generated tracker output can then be scored with the scoring scripts of the challenges, with something like:

        python score.py --dataset dstc2_test --dataroot ~/DSTC/data --ontology ~/DSTC/scripts/config/ontology_dstc2.json --trackfile dstc2_test-output.json --scorefile scores.csv
        python report.py --scorefile scores.csv

On the LIVE SLU, you should get, on the joint goals, something like (Acc: 0.7247394; L2: 0.4413819; ROC: 0.0001424). On the SJTU+sys SLU, the performances are better (Acc: 0.7592115; L2: 0.3618278; ROC: 0.3189233). Actually, you could even slightly increase the accuracy, to 76.2%, by blocking the negate rule when the user is informing a slot for which an expl-conf is being considered. But that is only a 0.3% gain, so let us keep it simple.

A port in Python

In the repository, there is a Python script, baseline_yarbus.py, which follows the same structure as the other baselines submitted to the challenges. However, be warned that the Python script can be much slower than the C++ code (because of some inefficient data structures; this slowness is especially noticeable on SLU with many hypotheses). In short, we advise you to use the C++ version. As in the C++ version, you can specify the rule set and the threshold used for pruning the belief. For example, pruning at 0.01 and using the expl-conf and inform rules, you would call something like (this call might vary depending on the paths of the data):

        python baseline_yarbus.py --rule_set 24 --thr_belief 0.01 --dataset dstc2_test --dataroot ../data --ontology config/ontology_dstc2.json --trackfile yarbus.json

On the SJTU+sys SLU, you should get (Acc: 0.7579730, L2: 0.3678179, ROC: 0.3206699). On the LIVE SLU, the performances are (Acc: 0.7231912, L2: 0.4457830, ROC: 0.0001427).
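To make the pruning step concrete, here is a minimal sketch of what pruning a belief at a given threshold could look like; representing the belief as a plain dict of hypotheses and renormalizing the surviving mass are assumptions for illustration, not necessarily what baseline_yarbus.py does internally.

        # Illustrative sketch only: prune a belief (hypothesis -> probability) at a
        # threshold and renormalize what survives. The actual data structures of
        # baseline_yarbus.py may differ.
        def prune_belief(belief, thr_belief=0.01):
            kept = {hyp: p for hyp, p in belief.items() if p >= thr_belief}
            total = sum(kept.values())
            if total == 0.0:
                return belief                      # nothing survives: keep the original belief
            return {hyp: p / total for hyp, p in kept.items()}

        belief = {"food=chinese": 0.70, "food=thai": 0.295, "food=korean": 0.005}
        print(prune_belief(belief, 0.01))          # drops food=korean and renormalizes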

Statistics

In order to check which dialog acts are used and with which slot/value pairs, you can run the src/statistics scripts. They generate LaTeX output ready to be compiled twice with pdflatex, and basically produce histograms of the acts and of the slot/value pairs involved.
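As an illustration of the kind of counting these scripts perform, here is a minimal Python sketch that tallies the dialog acts of the best live SLU hypothesis of each turn of one dialog; the field names assume the standard DSTC2 log.json layout, and the LaTeX/histogram generation of the actual scripts is left out.

        # Minimal sketch: count the dialog acts appearing in the best live SLU hypothesis
        # of each turn (standard DSTC2 log.json layout assumed). The actual statistics
        # scripts also count slot/value pairs and emit LaTeX histograms.
        import json
        from collections import Counter

        with open("log.json") as f:
            log = json.load(f)

        acts = Counter()
        for turn in log["turns"]:
            slu_hyps = turn["input"]["live"]["slu-hyps"]
            if slu_hyps:
                best = slu_hyps[0]["slu-hyp"]      # first (typically best-scored) hypothesis
                for act in best:
                    acts[act["act"]] += 1

        print(acts.most_common())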

The Viewer

Just launch the viewer (src/viewer), load an ontology, a dialog file list and, if you wish, a tracker output, then explore the data.