Computer Science Coursework

Note: I can grant access to any repos that are marked private, if you are interested in viewing the code.


  1. Programming Languages and Translators (Jan 2022 - May 2022)
  2. Advanced Databases (Jan 2022 - May 2022)
  3. Cloud Computing and Big Data (Jan 2022 - May 2022)
    • Class Description: Learned how to design serverless, asynchronous, scalable systems that handle high loads using cloud microservice architectures. We focused on AWS, but the principles apply to other cloud providers. Decoupling services from one another is what lets these systems perform well at scale. We also experimented with cutting-edge resources and learned to use logging, asynchronous triggers, message queues, and NoSQL databases, as well as big data tools such as MapReduce and Spark for operations on massive datasets.
    • Surfworld Project: Built a web app for avid surfers that aggregates surf-specific data for the world's popular surf breaks from third-party APIs. We present the data in an easy-to-digest way, allowing surfers to search near their location (using geolocation) or for specific places. We leveraged AWS cloud services to build a fully decoupled system that could scale to a large number of users.
    • Course Projects (Private):
      • Chatbot: Implemented a serverless, microservice-driven web application: a Dining Concierge chatbot that sends restaurant suggestions based on the preferences the user provides through conversation.
      • Photo Album: Implemented a photo album web application that can be searched using natural language through both text and voice. Used Lex, Elasticsearch, and Rekognition to create an intelligent search layer that queries photos for people, objects, actions, landmarks, and more.
      • Email Service: Implemented a machine learning model to predict whether a message is spam. Built a system that, upon receipt of an email message, automatically flags it as spam or not based on the model's prediction.
  4. Engineering Software as a Service (Sep 2021 - Dec 2021)
    • Class Description: Crash course in modern software engineering project management skills, such as Test-Driven and Behavior-Driven Development using RSpec and Cucumber. We also learned how to take ownership of the entire software development life cycle. Through iterative management of a Minimum Viable Product (MVP), we worked closely with our mentors to ensure that we built the "correct application," rather than merely building the "application correctly." Lastly, we learned how to pick up new languages and frameworks on the fly, such as Ruby and Rails, in order to develop our SaaS web application.
    • InGame Web Application: Our application gives gamers a place to store and share in-game stills or live clips that they have created using gallery-mode developer tools. Gaming studios have developed advanced screen capture functionality and gallery modes that allow gamers to screenshot in-game moments. These features have evolved to the point where gamers are a burgeoning content creation market. With this power in their hands, gamers can push the limits of a game’s intended use to create "new media art" and share their in-game achievements. As the market for this content grows, creators will need a place to store, refine, and share it.
      • Project Deliverables: InGame Web Application (Hosted on Heroku), InGame Live Demo, InGame Demo Slides, InGame Proposal Slides
      • Source Code: InGame Rails App Code (Private)
      • Additional Notes:
        • Over 4 iterations, we applied key Agile development concepts to produce, refactor, and refine our SaaS web application.
        • Pitched our application to our classmates and professor through a proposal, demo, and launch, similar to an investor meeting.
        • Have a live deployment of our app on Heroku. Gamers, feel free to join our community!
        • Leveraged AWS as our third-party storage for all content uploaded by InGame users.
        • Utilized PostgreSQL for our database schema and to establish relationships between our internal models.
        • Codebase written in Ruby on Rails, which allowed us to leverage the Model-View-Controller architecture of the Rails framework.
        • Implemented a simple, minimalist interface that keeps the user experience uncluttered.
  5. C++ for C Programmers (May 2021 - Jun 2021)
    • mystring.cpp, Makefile, test1.cpp, ..., test5.cpp: Analyzed when constructors, destructors, and copy/move construction and assignment are called on the stack by writing a C++ string class (defined as an interface in a .h file, implemented in a .cpp file) and various tester files.
    • mdb-grep.cpp, mdb.cpp, mdb.h: Focused on advanced C++ language features (multiple inheritance, virtual functions, initializer lists, RAII) and STL classes (regex pattern matching and C++ file streams) to write database structures to files and read them back in binary form, and to search database files for specific patterns using regex objects.
    • xdb.h, exception.h, helper.h, maker.cpp, maker.h, mymake.cpp, MyMakefile: Constructed a C++ utility that compiles programs by implementing a subset of GNU make’s functionality: it populates a rules map and recursively builds target executables. Then optimized the build process by caching the rules map, via a generic template database structure, to a serialized mymake.cache file. The package organizes its code through a shared namespace and performs extensive error checking, using try/catch blocks that check for and recover from user-defined exceptions so that the build software never crashes.
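The recursive build over a rules map described above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the project's C++, and the `rules` dictionary, the `build_order` function, and the file names are hypothetical stand-ins, not the actual mymake code. It only computes the order in which targets would be built; timestamp checks and running compiler commands are left out.

```python
from typing import Dict, List, Set

# Hypothetical rules map: target name -> list of prerequisites.
Rules = Dict[str, List[str]]


def build_order(target: str, rules: Rules) -> List[str]:
    """Return the targets to build, prerequisites first (depth-first recursion)."""
    order: List[str] = []
    done: Set[str] = set()
    in_progress: Set[str] = set()

    def visit(name: str) -> None:
        if name in done:
            return
        if name in in_progress:
            raise ValueError(f"circular dependency involving {name!r}")
        in_progress.add(name)
        for dep in rules.get(name, []):
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        # Names without a rule (plain source files) are leaves and are not built.
        if name in rules:
            order.append(name)

    visit(target)
    return order


example_rules: Rules = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.cpp", "util.h"],
    "util.o": ["util.cpp", "util.h"],
}
print(build_order("app", example_rules))
```

Tracking an `in_progress` set is one simple way to report circular rules as an error instead of recursing forever.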
  6. Analysis of Algorithms (Sep 2021 - Dec 2021)
    • Algorithmic design for complex problems, and how to analyze algorithmic correctness and performance.
    • Main Topics Covered: sorting algorithms, recursion, recurrence relations, divide and conquer, data structures, greedy algorithms, dynamic programming, graphs and advanced graph/tree algorithms, network flows, linear programming, randomized algorithms, NP-Completeness.
    • Practiced reducing problems with no clear solution to ones in my toolkit that have efficient and elegant solutions.
    • Although we placed an emphasis on pseudocode designs, I wrote the following Python solutions:
      • sorting and data structures: QS_3_way_partition.py, bin_packing.py, BST_sorted_interval.py, bucket_sort.py, counting_sort.py, merge_k_sorted_lists.py, smaller_than_k_minheap.py.
      • greedy and dynamic programming: guards.py, max_independent_set.py, scheduling.py, weighted_jobs.py.
      • graphs and network flows: articulation_point.py, bipartite.py, min_edge_dijkstra.py, min_spanning_tree.py, SCC_dag.py, num_paths.py, profit_cost_ratio.py.
  7. Introduction to Databases (Jan 2021 - Apr 2021)
    • Restaurant Database Design Group Project:
      • Entity-Relationship Design: Performed E-R modeling for our restaurant database and converted the abstract schema to a relational database model.
      • Table Design and Data Creation: Defined the SQL schema in PostgreSQL, uploaded data, and tested the database with advanced queries.
      • Three Tier Architecture Web Interface: Hosted the restaurant website content on Google Cloud Platform and developed the front-end interface. Used Python and Flask to communicate with the PostgreSQL database, and Jinja templates to display query results. The architecture served static and dynamic content to the client.
      • Object-Oriented Features: Added advanced SQL features, such as triggers, text, and arrays, to further refine our database and better model a restaurant's digital needs.
    • Homework Assignments:
      • Performed various complex SQL queries for a common class database, and added security features such as views, permissions, and roles.
      • Studied different database models such as object-relational DBs, and analyzed indexing, physical storage system costs, and query processing/optimization.
      • Analyzed advanced database theory (serialization, transactions, and normalization).
  8. Computer Networks (Sep 2021 - Dec 2021)
    • Learned about the technical foundations of the internet in a "top-down" fashion, including applications, protocols, algorithms (routing, transport/congestion control), and LANs.
    • Hands-on experience with the network protocol stack: Application, Transport, Network, and Link layers.
    • Application Layer Programming Assignment, Columbia ReopenCU imitator:
      • Using the Python Sockets API, wrote an application-layer program that modeled Columbia University's reopenCU application for Covid-19 daily admittance to campus for non-symptomatic students.
      • Wrote 4 programs: 2 client-server pairs, one using TCP and the other using UDP.
    • Transport Layer Programming Assignment, TCP Reliable File Transfer over UDP Socket API:
      • Using the Python Sockets API, wrote an application-layer program that provided reliable file transfers over unreliable data channels.
      • Implemented TCP-style reliability over the UDP Socket API and tested it against noisy transfer mediums using an emulator that simulated packet loss, corruption, delay, and more. In short, I implemented the reliability mechanisms of TCP, including packet timers, in-order delivery, checksums, and more, while using UDP sockets for transport.
      • Developed my own packet wrapper API, which serialized and deserialized data to be sent over the internet via the emulator proxy program.
      • This project was a good simulation of how reliability can be implemented at the application layer on top of noisy, unreliable data channels at the transport layer. Keeping this complexity in the higher layers of the protocol stack at the end hosts relieves pressure on core network hardware (routers and switches).
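As an illustration of the packet wrapper and checksum ideas above, here is a minimal sketch. The header layout (sequence number, ack number, flags, payload length, checksum) and the function names are assumptions made for this example, not the project's actual format, and CRC32 from `zlib` is swapped in for TCP's 16-bit Internet checksum to keep the code short.

```python
import struct
import zlib
from typing import Optional, Tuple

# Hypothetical header: seq (4 bytes), ack (4 bytes), flags (1 byte), payload length (2 bytes).
_FIELDS = struct.Struct("!IIBH")
_CRC = struct.Struct("!I")
_HEADER_LEN = _FIELDS.size + _CRC.size


def pack_packet(seq: int, ack: int, flags: int, payload: bytes) -> bytes:
    """Serialize header fields and payload, with a CRC32 over both."""
    fields = _FIELDS.pack(seq, ack, flags, len(payload))
    checksum = zlib.crc32(fields + payload)
    return fields + _CRC.pack(checksum) + payload


def unpack_packet(data: bytes) -> Optional[Tuple[int, int, int, bytes]]:
    """Return (seq, ack, flags, payload), or None if the packet is truncated or corrupted."""
    if len(data) < _HEADER_LEN:
        return None
    fields = data[:_FIELDS.size]
    (checksum,) = _CRC.unpack(data[_FIELDS.size:_HEADER_LEN])
    seq, ack, flags, length = _FIELDS.unpack(fields)
    payload = data[_HEADER_LEN:]
    if len(payload) != length or zlib.crc32(fields + payload) != checksum:
        return None
    return seq, ack, flags, payload


packet = pack_packet(seq=7, ack=0, flags=1, payload=b"hello")
print(unpack_packet(packet))
```

A receiver built on this would drop any packet for which `unpack_packet` returns `None` and rely on the sender's retransmission timer to recover.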
  9. Artificial Intelligence (Jan 2021 - Apr 2021)
    • 2048 Puzzle Game: Implemented an adversarial search agent that plays the game using expectiminimax and alpha-beta pruning.
    • Machine Learning: Implemented perceptron, linear regression, and clustering algorithms for data separation, prediction, and grouping RGB values.
    • N-Puzzle Board Game: Implemented and compared breadth-first, depth-first, and A-star search algorithms.
    • Sudoku: Implemented backtracking search with the minimum-remaining-values heuristic to reduce variables' domains.
    • Python Review: Solved various coding problems, with an emphasis on Python data structures (lists, priority queues, graphs, sets, and more).
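The Sudoku approach above (backtracking search guided by the minimum-remaining-values heuristic) can be sketched roughly as follows. This is an illustrative version under my own assumptions about the board representation (a 9x9 list of lists with 0 for empty cells); the names `peers`, `candidates`, `solve`, and `PUZZLE` are hypothetical and not taken from the coursework.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]
Grid = List[List[int]]


def peers(cell: Cell) -> Set[Cell]:
    """All cells sharing a row, column, or 3x3 box with the given cell."""
    r, c = cell
    br, bc = 3 * (r // 3), 3 * (c // 3)
    result = {(r, j) for j in range(9)} | {(i, c) for i in range(9)}
    result |= {(br + i, bc + j) for i in range(3) for j in range(3)}
    result.discard(cell)
    return result


def candidates(grid: Grid, cell: Cell) -> Set[int]:
    """Values still legal for an empty cell given its peers."""
    used = {grid[r][c] for r, c in peers(cell)}
    return {v for v in range(1, 10) if v not in used}


def solve(grid: Grid) -> bool:
    """Fill the grid in place; return True if a solution was found."""
    empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not empties:
        return True
    # MRV: branch on the empty cell with the fewest legal values.
    r, c = min(empties, key=lambda cell: len(candidates(grid, cell)))
    for value in sorted(candidates(grid, (r, c))):
        grid[r][c] = value
        if solve(grid):
            return True
        grid[r][c] = 0  # undo the assignment and backtrack
    return False


# A widely used example puzzle; "0" marks an empty cell.
PUZZLE = [
    "530070000", "600195000", "098000060",
    "800060003", "400803001", "700020006",
    "060000280", "000419005", "000080079",
]
board = [[int(ch) for ch in row] for row in PUZZLE]
if solve(board):
    for row in board:
        print("".join(map(str, row)))
```

Choosing the most constrained cell first tends to expose dead ends early, which is the point of the heuristic.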
  10. Operating Systems I (Jan 2021 - Apr 2021)
    • shell.c, entry.S, main.c, bare_hello.S, floppy.flp: Wrote a simple bash-like shell program, to understand the basic services provided by modern operating systems. Studied the bootstrap process (from BIOS loading the MBR to GRUB bootloader) through writing x86 assembly code.
    • prinfo.h, ptree.c, syscalls.h: Practiced building and running the complex Linux kernel. Traced the whole process tree along with process descriptors (name, state, pid, parent pid) with a kernel system call. Wrote user-space C code to pull and print the process tree in breadth-first order.
    • pstrace.c, pstrace.h: Wrote system calls that trace process state changes (TASK_RUNNING, ... , EXIT_ZOMBIE, EXIT_DEAD), and copy privileged data from kernel to user space.
    • core.c, wrr.c: Hacked the Linux kernel scheduler by defining a multicore round-robin scheduling class. Protected the integrity of each CPU core and its data structures through synchronization and load balancing.
    • expose_pgtbl.c, expose_pgtbl.h: Mapped virtual memory addresses to physical memory addresses, and investigated the Linux process address space through C code that triggered memory metadata changes.
    • inode.c, file.c: Implemented the ppagefs pseudo-filesystem (completely in RAM, not backed up to disk).
  11. Advanced Programming in C (Sep 2020 - Dec 2020)
    • mylist.h, mylist.c: Created a generic library.
      • Wrote a generic singly linked list that could hold any data type, along with associated generic methods.
      • Used function pointers to take command line args in argv and return them in reverse order.
    • mdb.c: Designed our own data structure that could be written to and read from a file in binary form.
    • mdb-lookup-server-nc.sh: Shell scripting, Linux process management (fork/exec and all the various process states), TCP/IP networking.
      • Used netcat to connect to our database remotely to perform various C I/O functions while using the fork/exec paradigm.
    • http-client.c, http-server.c: Leveraged C sockets API and HTTP protocol to develop a webpage following three tier architecture.
      • The HTTP client would receive data transferred over the internet.
      • The HTTP server could serve static content, defined in the website directories, and/or interface with our previously defined database to service dynamic database queries and render the results on a webpage.
  12. Data Structures in Java (May 2020 - Jul 2020)
    • BigO-and-Generic: Implemented generic classes and interfaces, linear search and recursive generic binary search algorithms, and performed Big-O analysis.
    • Stacks-and-Queues: Created a symbol balance class that iterates through Java source code and uses a stack to ensure that syntax symbols match. Also implemented a generic stack class and a two-stack queue.
    • Trees: Used recursive algorithms to build and traverse nodes in expression trees (postfix, prefix, and infix). For Binary Search Trees, I wrote algorithms that recursively mirror the tree, traverse and find "the smallest value greater than x", and "pretty print" the node values of the binary tree.
    • Maps-and-heaps: Implemented a K-best algorithm and a spell checking class.
    • Dijkstra: Determined optimal routing from city x to city y, highlighted in the Display.java GUI, using Dijkstra's greedy algorithm.
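The stack-based symbol balancing and the two-stack queue from the Stacks-and-Queues assignment can be sketched as follows. The coursework was in Java; this Python version is only an illustration, and `is_balanced` and `TwoStackQueue` are hypothetical names. The balance check here looks at brackets only and does not skip string literals or comments, which a checker for real source code would need to handle.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")

# Each closing symbol maps to the opening symbol it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}


def is_balanced(source: str) -> bool:
    """Return True if every bracket in source is closed in the right order."""
    stack: List[str] = []
    for ch in source:
        if ch in "([{":
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack  # leftover openers mean the input is unbalanced


class TwoStackQueue(Generic[T]):
    """FIFO queue built from two LIFO stacks (Python lists used as stacks)."""

    def __init__(self) -> None:
        self._inbox: List[T] = []
        self._outbox: List[T] = []

    def enqueue(self, item: T) -> None:
        self._inbox.append(item)

    def dequeue(self) -> T:
        if not self._outbox:
            # Reversing the inbox onto the outbox restores arrival order.
            while self._inbox:
                self._outbox.append(self._inbox.pop())
        if not self._outbox:
            raise IndexError("dequeue from empty queue")
        return self._outbox.pop()


print(is_balanced("int[] a = { f(1), g(2) };"))
```

Each element is moved between the two stacks at most once, so `dequeue` runs in amortized constant time.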
  13. Introduction to Computer Science & Programming in Java (Sep 2019 - Dec 2019)
    • Drunkard's Walk and Leap Year Tester: Wrote a Java class to test whether a certain year is a leap year. Another class simulates the Drunkard's Walk model.
    • Cryptography: Encryption and decryption of user-entered messages through Substitution (Caesar Cipher), Additive (XOR Cipher), and Transposition (Columnar) techniques.
    • Video Poker: Developed a poker game where a user plays against the CPU with varying payout multipliers according to the desired bet.
    • Fail2Ban: Parsed error logs to determine the most frequent IP addresses that fail to log in to Columbia's network and wrote the results to an output log file.
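Two of the cipher techniques named in the Cryptography assignment can be illustrated briefly. The coursework was in Java; this Python sketch uses hypothetical function names and assumes the Caesar shift applies to ASCII letters only, while the XOR cipher works on bytes with a repeating key.

```python
def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, leaving other characters unchanged."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


secret = caesar("Hello, World!", 3)
print(secret)              # Khoor, Zruog!
print(caesar(secret, -3))  # Hello, World!
```

Decryption reuses the same functions: a negative shift undoes the Caesar cipher, and XOR with the same key is its own inverse.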
  14. Computing in Context (Python) (Sep 2018 - Dec 2018)
    • Bulls and Cows: Created a game where a user plays against the CPU to guess a secret, randomly generated number. The game ends when the user gets 4 bulls, i.e., guesses the CPU's secret number.
    • Scrabble Dictionary: Used Python I/O, dictionaries, and string methods to parse a dictionary text file and assist the user with finding words of a certain length, words that begin with a certain letter, and words that contain a certain letter.
    • Analyzing NYC Parking Ticket Data: Analyzed NYC parking ticket data from NYC Open Data to determine which year had the highest outstanding dollar balance, which license plate was the worst offender, and which diplomatic consulate had the highest outstanding parking ticket balance.
    • Computational Finance: Monte Carlo Simulation and option pricing.
    • Schelling's Model of Segregation: Used NumPy to simulate self-segregation according to a threshold t for the share of neighbors of a similar race. Run the visualize.py code from the command line to see the model.
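The scoring rule behind Bulls and Cows can be sketched in a few lines. This is an illustrative version rather than the original coursework code: `make_secret` and `score` are hypothetical names, and the secret is assumed to be four distinct digits.

```python
import random
from collections import Counter
from typing import Optional, Tuple


def make_secret(length: int = 4, rng: Optional[random.Random] = None) -> str:
    """Generate a secret of distinct digits (an assumption of this sketch)."""
    rng = rng or random.Random()
    return "".join(rng.sample("0123456789", length))


def score(secret: str, guess: str) -> Tuple[int, int]:
    """Return (bulls, cows): right digit in the right place, right digit in the wrong place."""
    bulls = sum(s == g for s, g in zip(secret, guess))
    # Digits in common regardless of position, counted with multiplicity.
    common = sum((Counter(secret) & Counter(guess)).values())
    return bulls, common - bulls


print(score("1234", "1243"))  # (2, 2)
```

A game loop would call `score` on each guess and stop once the number of bulls equals the length of the secret.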

Economics Coursework


  1. Sports Economics Senior Thesis (Jan 2019 - May 2019)
    • Major League Baseball Team Building: Should Franchises Focus on the Amateur Draft or Free Agency?
      • Abstract: This paper analyzes the relationship between the Major League Baseball amateur draft and team winning percentages from 2005 to 2018. To test the statistical significance of the amateur draft, I performed a two-stage least squares regression analysis that models the impact that draft success and payroll have on winning percentages. Understanding that drafting ability is not quantifiable, I use modern scouting theories to develop my first stage regressions and predict the rate at which teams find undervalued players. The second stage regressions formally test whether consistent draft success or free agency has a more statistically significant effect on winning. Even with significant investments in scouting and player development, this study refutes the theory that the amateur draft significantly impacts competitive balance.
    • Draft analysis Stata code