Skip to content
This repository has been archived by the owner on Jan 31, 2022. It is now read-only.

Add minimization algorithm for Acyclic FST #49

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AdrienDuff
Copy link

The implementation follow the Bottom-Up Algorithm from http://www.cs.jhu.edu/~hajic/courses/cs226/alg.html

  • Does not support weighted graphs.
  • Does not support gradients.
  • Does not check if graph is Acyclic.

Tests are coming soon.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 21, 2021
Copy link
Contributor

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I left some comments, please make the changes and then I will review again.

Also:

  1. Add tests
  2. Add a benchmark so we can tell how changes effect times.

@@ -151,5 +151,8 @@ Graph viterbiScore(const Graph& g);
*/
Graph viterbiPath(const Graph& g);

/** Minimize an Acyclic FST */
Graph minimizeAcyclicFST(const Graph& g);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just call this minimize. Ideally we would throw an error if a cycle is detected in the graph.



//Initialization
//a. Find all states with no outgoing arcs. (Since we are dealing with an acyclic FST, it is always possible.)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I would do is rename the function to minimize, and throw an error if this condition is not met (e.g. if we don't find any start or accept states with 0 outgoing nodes can assume that the graph is cyclic). In the future we can attempt to handle this case.

//a. Find all states with no outgoing arcs. (Since we are dealing with an acyclic FST, it is always possible.)
//b. Split the resulting set into 4 sets according to their START and ACCEPT status.

int nodeStartAccept = -1 , nodeStartNoAccept = -1, nodeNoStartAccept = -1, nodeNoStartNoAccept = -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we assume in GTN that graphs are "trim" meaning there are no paths which lead to dangling nodes. You are welcome to make the same assumption, or alternatively if you find any node that is not accepting that has 0 outgoing arcs we should ignore it since it does nothing for the set of paths the graph allows and really shouldn't be part of a "minimal" graph.

Graph minimizeAcyclicFST(const Graph& g){
Graph graph;
std::vector<int> oldToNew(g.numNodes(), -1); // a map between the nodes of g and the minimized graph.
std::vector<int> oldProcessed; // store which nodes has been processed in the g graph
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should make this a vector<bool> processed (the name old is redundant). Make it the size of the old graph number of nodes and initialize every element to false. Then to check if a node is already processed check processed[n].

Comment on lines +409 to +410
if (std::all_of(g.out(predNode).begin(), g.out(predNode).end(),
[&g, &oldProcessed](int a) {return std::count(oldProcessed.begin(), oldProcessed.end(), g.dstNode(a)) > 0;})){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change oldProcessed to processed as mentioned above and simplify this accordingly.

Comment on lines +350 to +356
if ( std::equal(g.out(node1).begin(), g.out(node1).end(), g.out(node2).begin(), [&g, &oldToNew](int a1, int a2){
return (g.ilabel(a1) == g.ilabel(a2) &&
g.olabel(a1) == g.olabel(a2) &&
oldToNew[g.dstNode(a1)] == oldToNew[g.dstNode(a2)]);})
){
return true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: replace this block with return std::equal(...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants