Input files

philippelucarelli edited this page Sep 14, 2016 · 9 revisions

Two input files are needed to create an optimization problem in FALCON. One describes the network structure; the other describes the experimental conditions and their associated experimental data. Both files can be in either Excel format (.xls or .xlsx) or text format (.txt).

1. Model description

In the text format, the network model is a tab-separated table in which each line defines one interaction. The first three columns define the interaction itself, similar to the simple interaction file (.sif) format: the first column gives the name of the source node, the second the type of interaction (‘->’ for activation, ‘-|’ for inhibition), and the third the name of the target node. The fourth, fifth, and sixth columns hold additional arguments for each interaction. The fourth column assigns the name of the parameter associated with the interaction. The fifth column defines the type of linear or non-linear interaction, expressed as a Boolean gate: ‘A’ for the AND gate, ‘O’ for the OR gate, and ‘N’ for no Boolean gate, which represent the convergence, redundancy, and addition of signals, respectively. The sixth column defines the bounds of the optimised parameter: ‘D’ (default range from 0 to 1), ‘H’ (high bound), or ‘L’ (low bound). The cut-off value between the high and low bounds is set by the variable “HLbound” (default value = 0.5).

High and low bounds are useful when there is prior knowledge about the relevance or importance of interactions in the network. For instance, signals from canonical pathways are usually stronger than those from crosstalk interactions, so interactions representing a canonical pathway could be assigned ‘H’ and those representing crosstalk ‘L’.
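The six-column layout described above can be checked with a short script before a model file is handed to the toolbox. The following is a minimal Python sketch and is not part of FALCON: the node names (`I1`, `I2`, `A`, `O1`), the parameter names (`k1` to `k3`), and the helper `parse_model` are all hypothetical, and the validation rules simply restate the column descriptions above.

```python
from typing import List, NamedTuple


class Interaction(NamedTuple):
    source: str     # column 1: source node
    kind: str       # column 2: '->' (activation) or '-|' (inhibition)
    target: str     # column 3: target node
    parameter: str  # column 4: parameter name
    gate: str       # column 5: 'A' (AND), 'O' (OR), or 'N' (no gate)
    bound: str      # column 6: 'D' (default), 'H' (high), or 'L' (low)


# Hypothetical three-line model file; names are made up for illustration.
MODEL_TEXT = (
    "I1\t->\tA\tk1\tN\tD\n"
    "I2\t->\tA\tk2\tN\tL\n"
    "A\t-|\tO1\tk3\tN\tH\n"
)


def parse_model(text: str) -> List[Interaction]:
    """Parse tab-separated interaction lines and check each column's value."""
    interactions = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) != 6:
            raise ValueError(f"line {number}: expected 6 columns, got {len(fields)}")
        item = Interaction(*fields)
        if item.kind not in ("->", "-|"):
            raise ValueError(f"line {number}: unknown interaction type {item.kind!r}")
        if item.gate not in ("A", "O", "N"):
            raise ValueError(f"line {number}: unknown gate {item.gate!r}")
        if item.bound not in ("D", "H", "L"):
            raise ValueError(f"line {number}: unknown bound {item.bound!r}")
        interactions.append(item)
    return interactions


model = parse_model(MODEL_TEXT)
for interaction in model:
    print(interaction)
```

A check like this catches misplaced columns or mistyped gate and bound codes early; it makes no claim about how the toolbox itself reads the file.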

In the Excel format, each network interaction is described in the same manner as in the text format. The columns carry titles, which are not included in the model description.

Boolean functions with more than two inputs can also be specified; the toolbox automatically expands them to their simple form.

2. Data description

In the text format (see Toy example), the data file is organised into three tab-separated columns. The first column defines the experimental conditions, i.e. the combinations of inputs into the system. Each input name is followed by its state value (from 0 to 1), separated by a comma (,), and successive input assignments are likewise separated by commas. The second column defines the experimental readouts of the output nodes, in the same format as the inputs. If a measure of variation is available, e.g. standard deviations (S.D.), it goes in the third column, again in the same format. The third column can be omitted if the variance of the data is not known, e.g. in a single-replicate experiment.
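To make the column layout concrete, here is a minimal Python sketch that reads such a data file. It takes the description above literally, assuming each column is a flat comma-separated sequence of name,value pairs; the node names, the values, and the helpers `parse_assignments` and `parse_data` are hypothetical and not part of FALCON. The Toy example file remains the authoritative reference for the exact syntax.

```python
from typing import Dict, List, Optional

# Hypothetical data file: two conditions, inputs I1 and I2, one output O1.
DATA_TEXT = (
    "I1,1,I2,0\tO1,0.8\tO1,0.05\n"
    "I1,0,I2,1\tO1,0.2\tO1,0.03\n"
)


def parse_assignments(field: str) -> Dict[str, float]:
    """Turn 'name,value,name,value,...' into a {name: value} mapping."""
    tokens = [token.strip() for token in field.split(",")]
    if len(tokens) % 2 != 0:
        raise ValueError(f"expected name,value pairs, got {field!r}")
    return {tokens[i]: float(tokens[i + 1]) for i in range(0, len(tokens), 2)}


def parse_data(text: str) -> List[Dict[str, Optional[Dict[str, float]]]]:
    """Parse one experimental condition per line; the error column is optional."""
    conditions = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        columns = line.split("\t")
        if len(columns) not in (2, 3):
            raise ValueError(f"line {number}: expected 2 or 3 columns")
        conditions.append({
            "inputs": parse_assignments(columns[0]),
            "outputs": parse_assignments(columns[1]),
            # Third column omitted, e.g. for a single-replicate experiment.
            "errors": parse_assignments(columns[2]) if len(columns) == 3 else None,
        })
    return conditions


data = parse_data(DATA_TEXT)
for condition in data:
    print(condition)
```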

In the Excel format, the data are split across three datasheets: ‘input’, ‘output’, and ‘error’. In the ‘input’ datasheet, the first row gives the names of the inputs and each successive row gives the input values for one experimental condition. The ‘output’ and ‘error’ datasheets follow the same layout, with the names of the outputs in the first row and their associated values in the successive rows.

Note: missing data points in the data table must be filled with “NaN” (Not-a-Number). These points are then excluded from the calculation of the fitting cost and are not displayed on the plots.
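The effect of a NaN placeholder on the fitting cost can be illustrated with a small Python sketch. This page does not state which cost function the toolbox uses, so the mean squared error below is an assumption chosen only for illustration, and `masked_mse` is a hypothetical helper, not a FALCON function.

```python
import math
from typing import Sequence


def masked_mse(measured: Sequence[float], simulated: Sequence[float]) -> float:
    """Mean squared error over the points whose measurement is not NaN.

    Assumption: MSE stands in for the fitting cost, which is not
    specified in this page.
    """
    if len(measured) != len(simulated):
        raise ValueError("measured and simulated must have the same length")
    pairs = [(m, s) for m, s in zip(measured, simulated) if not math.isnan(m)]
    if not pairs:
        raise ValueError("no non-missing data points to compare")
    return sum((m - s) ** 2 for m, s in pairs) / len(pairs)


# The second measurement is missing, so only the first and third points count.
cost = masked_mse([1.0, float("nan"), 0.0], [0.5, 0.9, 0.5])
print(cost)  # 0.25
```

Because the missing point is dropped rather than treated as zero, it neither penalises nor rewards the simulated value at that position.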