The goal of the last homework is for students to expand their experience with solving distributed computational problems using cloud computing technologies. Our goal is to design and implement a variant of the graph game called Policeman and Thief, deploy it on the cloud as a microservice and enable clients to play it using HTTP requests.
This Git repo contains the homework description that uses an open-source implementation of a network simulator in Scala.
As part of the previous homework assignments, students have learned how to create and manage a Git project repository, create an application in Scala, create tests using the widely popular ScalaTest framework, and expand on the provided SBT build and run script for their applications based on randomly generated graphs that represent big data.
First things first: if you haven't done so as part of the first homework, you must create your account at GitHub, a Git repo management system. Please make sure that you write your name in the README.md in your repo as it is specified on the class roster. Since it is a large class, please use your UIC email address for signing your projects and avoid emails from other, non-UIC accounts. As always, the class Teams channel for homeworks is the preferred way to exchange information and ask questions. If you don't receive a response within 12 hours, please contact your TA or me by tagging our names. If you use email, your direct messages may end up in the spam folder.
Next, if you haven't done so as part of your first homework, you will install IntelliJ with your academic license, the JDK, the Scala runtime, the IntelliJ Scala plugin, and the Simple Build Tool (SBT), and make sure that you can create, compile, and run Java and Scala programs. Please make sure that you can run various Java tools from your chosen JDK between versions 8 and 19.
In all homeworks you should use logging and configuration management frameworks. You will comment your code extensively to describe your design choices and the intricate, i.e., nontrivial, details of your implementation, and you will add logging statements at different logging levels (e.g., TRACE, INFO, WARN, ERROR) to record information at some salient points in the executions of your programs. All input configuration variables/parameters must be supplied through configuration files -- hardcoding these values in the source code is prohibited and will be punished by taking a large percentage of points from your total grade! You are expected to use Logback and SLF4J for logging and the Typesafe Configuration Library for managing configuration files. These and other libraries should be imported into your project using your build.sbt script. These libraries and frameworks are widely used in the industry, so learning them is time well spent to improve your resumes. Also, please set up your account with AWS Educate. Using your UIC email address may enable you to receive free credits for running your jobs in the cloud. Preferably, you should create your developer account for $29 per month to enjoy the full range of AWS services, if you haven't already done so.
As you already know, the implementation of the network simulator demonstrates how to use Scala to create a fully functional (not imperative) implementation with subprojects and tests. As you see from the StackOverflow survey, knowledge of Scala is highly paid and in great demand, and it is expected that you pick it up relatively fast, especially since it is tightly integrated with Java. I recommend the book Programming in Scala (Fourth or Fifth Edition) by Martin Odersky et al. You can obtain this book using the academic subscription on Safari Books Online. There are many other books and resources available on the Internet to learn Scala. Those who know more about functional programming can use the book Functional Programming in Scala, published in 2023 by Michael Pilquist, Rúnar Bjarnason, and Paul Chiusano.
When creating your RESTful/gRPC services in Scala, you should avoid using vars and while/for loops that iterate over collections using induction variables. Instead, you should learn to use collection methods such as map, flatMap, foreach, filter and many others with lambda functions, which make your code linear and easy to understand. Also, avoid at all costs mutable variables that expose the internal states of your modules. Points will be deducted for unreasonable vars and induction-variable loops that come without an explanation of why mutation is needed in your code, unless the mutation is confined to method scopes - you can always do without it.
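As a minimal sketch of this style, the following fragment processes a list of edges with collection methods only, without vars or index-based loops. The `Edge` case class and the helper names are hypothetical and exist only for this illustration; they are not part of NetGameSim.

```scala
// Hypothetical edge representation used only for this illustration.
final case class Edge(from: Int, to: Int, cost: Double)

object FunctionalStyle {
  // Successor nodes of `node`, sorted, computed without an index variable or a var.
  def successors(edges: List[Edge], node: Int): List[Int] =
    edges.filter(_.from == node).map(_.to).distinct.sorted

  // Total cost of the edges leaving `node`; foldLeft replaces a mutable accumulator.
  def outgoingCost(edges: List[Edge], node: Int): Double =
    edges.filter(_.from == node).foldLeft(0.0)((acc, e) => acc + e.cost)

  // Out-degree of every node that has at least one outgoing edge.
  def outDegrees(edges: List[Edge]): Map[Int, Int] =
    edges.groupBy(_.from).map { case (n, es) => (n, es.size) }
}
```

Each function is a single expression over an immutable list, so there is no shared state for other modules to observe or corrupt.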
A game on a graph is a general term for a competition conducted according to rules, with the participants (players) in direct opposition to each other, where these participants make moves based on some graph structure with a predefined system of rewards and penalties. A general representation of the Policeman/Thief game is to assign one or more Policemen (P) and Thieves (T) to different nodes in a graph initially. A player chooses to represent a P or a T. Players take turns moving their corresponding P or T along directed edges between nodes. Some nodes have the attribute ValuableData, and if T ends up at a node with this attribute then s/he wins the game and P loses the game. However, if P and T end up at the same node then P wins the game and T loses it. If either P or T ends up at a node where there are no moves available then the stuck player with no available moves loses the game. These are the basic rules of the game.
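The basic rules above can be sketched as a pure function over a game state. This is only one possible reading: the `GameState` type, the `Outcome` names, and the order in which the rules are checked (for example, what happens when P catches T on a node with valuable data) are assumptions made for illustration, not part of the assignment.

```scala
sealed trait Outcome
case object PolicemanWins extends Outcome
case object ThiefWins extends Outcome
case object Ongoing extends Outcome

// Hypothetical game state: directed edges as adjacency sets, plus both positions.
final case class GameState(
  successors: Map[Int, Set[Int]], // node -> nodes reachable by one directed edge
  valuable: Set[Int],             // nodes carrying the ValuableData attribute
  policeman: Int,
  thief: Int
)

object BasicRules {
  // Applies the basic rules; the precedence of the checks is an assumption.
  def evaluate(s: GameState): Outcome = {
    def stuck(node: Int): Boolean =
      s.successors.getOrElse(node, Set.empty[Int]).isEmpty

    if (s.policeman == s.thief) PolicemanWins          // P catches T
    else if (s.valuable.contains(s.thief)) ThiefWins   // T reaches valuable data
    else if (stuck(s.thief)) PolicemanWins             // T has no moves left
    else if (stuck(s.policeman)) ThiefWins             // P has no moves left
    else Ongoing
  }
}
```

Keeping the rules in one side-effect-free function makes them easy to cover with ScalaTest cases, one per rule.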
We extend the P/T game by creating the original graph (OG) and its perturbed representation (PG). The game starts by placing P and T randomly at some nodes in the OG and their counterparts in the PG. If T is placed at a node with the valuable data then T wins by default and the game restarts. P and T query the PG to obtain information about their own and the opponent's location nodes and the adjacent nodes. Each of these nodes comes with a confidence score (C), ranging from 0 to 1, that the node and the edges that lead to it were not perturbed. For example, if the node remains the same in the PG and four out of five of its connected edges remain unchanged, then five of the six items are unchanged and C = 5/6. Players can also query the PG to determine how far they are from the nearest node with the valuable data. In addition to the basic game rules, a player who makes a move in the PG that cannot be performed in the OG loses the game. Using this limited information, players make their moves until one of them wins or loses the game.
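The confidence example and the two PG queries can be sketched as follows. This assumes one particular reading of the example above, in which the node and each of its connected edges count as one item each, so that an unchanged node with four of five unchanged edges scores 5/6. All names are hypothetical, and graphs are represented as plain adjacency maps rather than NetGameSim types.

```scala
object Perturbation {
  // Confidence under the assumed reading: unchanged items / (1 node + all edges).
  def confidence(nodeUnchanged: Boolean, unchangedEdges: Int, totalEdges: Int): Double = {
    require(unchangedEdges >= 0 && unchangedEdges <= totalEdges, "invalid edge counts")
    val unchanged = (if (nodeUnchanged) 1 else 0) + unchangedEdges
    unchanged.toDouble / (1 + totalEdges)
  }

  // A move picked from the PG is legal only if the same directed edge exists in the OG.
  def legalInOriginal(og: Map[Int, Set[Int]], from: Int, to: Int): Boolean =
    og.getOrElse(from, Set.empty[Int]).contains(to)

  // Breadth-first search: number of moves to the nearest valuable node, if reachable.
  def distanceToValuable(graph: Map[Int, Set[Int]], valuable: Set[Int], start: Int): Option[Int] = {
    @scala.annotation.tailrec
    def bfs(frontier: Set[Int], visited: Set[Int], depth: Int): Option[Int] =
      if (frontier.isEmpty) None
      else if (frontier.exists(valuable.contains)) Some(depth)
      else {
        val seen = visited ++ frontier
        val next = frontier.flatMap(n => graph.getOrElse(n, Set.empty[Int])) -- seen
        bfs(next, seen, depth + 1)
      }
    bfs(Set(start), Set.empty, 0)
  }
}
```

The search is written as a tail-recursive function over immutable sets, so it needs neither a var nor a mutable queue.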
Students will design and implement the P/T game on the cloud using microservices and participants (clients) can play this game over the Internet. Players submit their moves/queries using curl or Postman or using HTTP client request functionality in IntelliJ and the game server will accept these requests and produce responses to the clients. Graduate students should implement automated client programs that play the game to its completion using their own game strategies.
Your homework assignment consists of two interlocked parts: first, construct HTTP requests and responses for playing the game and second, implement the game server using microservices that receive these HTTP requests and reply to them using the rules of the P/T game. You will deploy an instance of the game engine on AWS EC2 and configure it to enable clients to play the P/T game. You can create additional projects for the RESTful service using the forked repo for NetGameSim if you wish to do so and extend the sbt script to include these subprojects as dependencies.
Your job is to design the P/T game, explain your design and architecture, and then implement it and run it on the big data graphs that you will generate using your predefined configuration parameters for NetGameSim. To implement the RESTful service you can use one of the popular frameworks: Play, Finch/Finagle, Akka HTTP, or Scalatra. There is a discussion thread on Reddit about which framework software engineers prefer for creating and maintaining RESTful services. Personally, I like Akka HTTP, but students are free to experiment with more than one framework. On a side note, it would be very suspicious for your TA and me to see very similar implementations of the service using the same framework that came from different submissions by different students.
You can implement the client programs that test your RESTful services as a Postman project, as curl commands in a shell script, or as a Scala program that uses the Apache HTTP client library.
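A minimal sketch of a Scala test client follows. Instead of the Apache HTTP client library it uses the HTTP client built into the JDK (`java.net.http`, available from JDK 11), so it needs no extra dependency. The base URL, the `/game/move` path, and the JSON fields are hypothetical, since the actual request format is part of your own design.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object MoveClient {
  // Builds a POST request for a move; the endpoint and payload shape are assumptions.
  def moveRequest(baseUrl: String, player: String, toNode: Int): HttpRequest = {
    val body = s"""{"player":"$player","to":$toNode}"""
    HttpRequest.newBuilder(URI.create(s"$baseUrl/game/move"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build()
  }

  // Sends the request and returns (status code, response body).
  // This call requires a running game server at the given address.
  def send(request: HttpRequest): (Int, String) = {
    val response = HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
    (response.statusCode(), response.body())
  }
}
```

Separating request construction from sending lets unit tests check the request format without a deployed server.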
The output of your program is a data file in some format of your choice, e.g., YAML or CSV, with the required statistics. The REST protocol is explained in the main textbook and elsewhere, and it is covered in class lectures. After creating and testing your game program locally, you will deploy it and run it on AWS EC2 - you can find plenty of documentation online. Just like for the previous homeworks, you will produce a short movie that documents all steps of the deployment and execution of your program with your narration, upload this movie to YouTube, and submit a link to it in the README.md file as part of your submission. To produce a movie, you may use an academic version of Camtasia or Zoom or some other cheap/free screen capture technology from the UIC webstore, or a movie capture application of your choice. The captured web browser content should show your login name in the upper right corner of the AWS application, and you should introduce yourself at the beginning of the movie, speaking into the camera.
Graduate students should create automatic client players that use some predefined strategies to play the game. That is, the clients send requests to the game server, receive replies and determine what next steps they should take. The report should discuss the results of experiments and how some strategies may lead to different outcomes.
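One way to structure such automatic players is to model a strategy as a function from the queried neighbor information to the next node. The sketch below is an illustration under stated assumptions: the `Neighbor` type, its fields, and both strategies are hypothetical and do not prescribe how your client or your server responses should look.

```scala
// Hypothetical view of one adjacent node, as a client might decode it from a server reply.
final case class Neighbor(node: Int, confidence: Double, distanceToValuable: Option[Int])

object Strategies {
  // A strategy picks the next node, or None when no acceptable move exists.
  type Strategy = List[Neighbor] => Option[Int]

  // Cautious: move to the neighbor least likely to have been perturbed.
  val cautious: Strategy = ns =>
    if (ns.isEmpty) None else Some(ns.maxBy(_.confidence).node)

  // Greedy thief: among sufficiently trusted neighbors, move toward valuable data.
  def greedy(minConfidence: Double): Strategy = ns =>
    ns.collect { case Neighbor(node, c, Some(d)) if c >= minConfidence => (d, node) }
      .sortBy(_._1)
      .headOption
      .map(_._2)
}
```

Because every strategy has the same type, an experiment can run the same client loop with different strategies and compare the recorded outcomes in the report.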
The game server implementation should use gRPC to invoke a Lambda function deployed on AWS as part of your game server design and implementation. The starting point is to follow the guide on the AWS Serverless Application Model (SAM). Once you follow the steps of the tutorial, you will be able to invoke a Lambda function via the AWS API Gateway.
Next, you will learn how to create a gRPC client program. I find this tutorial on gRPC on HTTP/2 very well written by Jean de Klerk, Developer Program Engineer at Google.
After that you will learn about AWS API Gateway and determine how to use it to create a RESTful API for your implementation of the Lambda function.
A guide to keep you on the right path is the blog entry that describes the process of using gRPC for invoking an AWS Lambda function in Go.
An excellent guide on how to create a REST service with AWS Lambda includes instructions on how to set up and configure AWS Lambda.
Your baseline project submission should include your implementation, a conceptual explanation in the document or in the comments in the source code of how your Spark/GraphX processing components work to solve the problem for Option 1 group or how your Wangle distributed object pipeline works for Option 2 group, and the documentation that describes the build and runtime process, to be considered for grading. You should use markdown for your project's README.md. Your project submission should include all your source code as well as non-code artifacts (e.g., configuration files), your project should be buildable using SBT, and your documentation must specify how you partitioned the data and what the inputs/outputs are.
You can post questions and replies, statements, comments, discussion, etc. on Teams using the corresponding channel. For this homework, feel free to share your ideas, mistakes, code fragments, commands from scripts, and some of your technical solutions with the rest of the class, and you can ask and advise others using Teams on where resources and sample programs can be found on the Internet and how to resolve dependencies and configuration issues. When posting questions and answers on Teams, please make sure that you select the appropriate channel, to ensure that all discussion threads can be easily located. Active participants and problem solvers will receive bonuses from the big brother :-) who is watching your exchanges (i.e., your class instructor and your TA). However, you must not describe intricate details of your architecture or your models!
This is an individual homework. Please remember to grant read access to your repository to your TA and your instructor. You can commit and push your code as many times as you want. Your code will not be visible and it should not be visible to other students - your repository should be private. Announcing a link to your public repo for this homework or inviting other students to join your fork for an individual homework before the submission deadline will result in losing your grade. For grading, only the latest commit timed before the deadline will be considered. If your first commit is pushed after the deadline, your grade for the homework will be zero. For those of you who struggle with Git, I recommend Ry's Git Tutorial by Ryan Hodson. The other book, Pro Git, is written by Scott Chacon and Ben Straub and published by Apress, and it is freely available. There are multiple videos on YouTube that go into the details of Git organization and use.
Please follow this naming convention to designate your authorship while submitting your work in README.md: "Firstname Lastname" without quotes, where you specify your first and last names exactly as you are registered with the University system, as well as your UIC.EDU email address, so that we can easily recognize your submission. I repeat, make sure that you give both your TA and the course instructor read/write access to your private forked repository so that we can leave the file feedback.txt in the root of your repo with the explanation of the grade assigned to your homework.
As mentioned above, you can post questions and replies, statements, comments, discussion, etc. on Teams. Remember that you cannot share your code and your solutions privately, but you can ask and advise others using Teams and StackOverflow or some other developer networks on where resources and sample programs can be found on the Internet and how to resolve dependencies and configuration issues. Yet, your implementation should be your own and you cannot share it. Likewise, you cannot copy and paste someone else's implementation and put your name on it. Your submissions will be checked for plagiarism. Copying code from your classmates or from some sites on the Internet will result in severe academic penalties up to the termination of your enrollment in the University.
The submission deadline is Sunday, November 19, 2023 at 11:59PM CST; you submit by posting the link to your homework repo in the Teams Assignments channel. Your submission repo will include the code for the program, your documentation with instructions and detailed explanations on how to assemble and deploy your program along with the results of your program execution, the link to the video, and a document that explains these results based on the characteristics and the configuration parameters of your generated graphs, and what the limitations of your implementation are. Again, do not forget, please make sure that you give both your TAs and your instructor read access to your private repository. Your code should compile and run from the command line using the commands sbt clean compile test and sbt clean compile run. Also, your project should be IntelliJ friendly, i.e., your graders should be able to import your code into IntelliJ and run it from there. Use .gitignore to exclude files that should not be pushed into the repo.
- the maximum grade for this homework is 20% plus up to 5% bonus for fully completing the optional part. Points are subtracted from this maximum grade: for example, saying that 2% is lost if some requirement is not completed means that the resulting grade will be 20%-2% => 18%; if the core homework functionality does not work or it is not implemented as specified in your documentation, your grade will be zero;
- only some basic RESTful examples from some repos are given and nothing else is done: zero grade;
- not implementing the game algorithm: 10% penalty;
- not implementing the automatic client playing program for graduate students results in 5% loss;
- having fewer than five unit and/or integration ScalaTest tests: up to 10% lost;
- missing comments and explanations from your program with clarifications of your design rationale: up to 15% lost;
- logging is not used in your programs: up to 5% lost;
- hardcoding the input values in the source code instead of using the suggested configuration libraries: up to 5% lost;
- for each used var for heap-based shared variables or mutable collections: 0.3% lost;
- for each used while or for or other loops with induction variables to iterate over a collection: 0.5% lost;
- no instructions in README.md on how to install and run your program: up to 10% lost;
- the program crashes without completing the core functionality: up to 15% lost;
- the documentation exists but it is insufficient to understand your program design and models and how you assembled and deployed all components of your game implementation: up to 20% lost;
- the minimum grade for this homework cannot be less than zero.
That's it, folks! The semester is almost over!