Computer Science Coursework
Note: I can grant access to any repos that are marked private, if you are interested in viewing the code.
- Programming Languages and Translators (Jan 2022 - May 2022)
- Class Description: Wrote the Viz programming language from scratch. In this semester-long project, we wrote a compiler in OCaml that implements the language we defined.
We learned about scanning, parsing, abstract syntax trees, semantic checking, LLVM IR code,
compiler and code optimization, and more. Check out all of our source code and documents below!
- Advanced Databases (Jan 2022 - May 2022)
- Class Description: This course moved beyond traditional RDBMSs. We learned about Information Retrieval from unstructured text databases and how to source, rank,
and parse text documents efficiently. We then moved on to Information Extraction using web documents and learned how modern web search engines such as Google and
algorithms such as PageRank work. Next, we explored Data Mining, using a real-world curated dataset to extract interesting association rules with the
Apriori algorithm. Lastly, we switched gears to OLAP and how DB systems can support different types of data queries.
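As a rough illustration of the Apriori idea mentioned above (not the course project's code), the Python sketch below finds frequent itemsets level by level: it joins frequent (k-1)-itemsets into k-item candidates, prunes any candidate with an infrequent subset, and keeps those that meet the support threshold. The basket data, the `min_support` value, and the function name `apriori` are all made up for this example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset (frozenset): support} for all frequent itemsets."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]

    def support(itemset):
        # Fraction of baskets that contain every item in `itemset`.
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

# Hypothetical market baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = apriori(baskets, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

Association rules can then be read off the frequent itemsets by comparing the support of an itemset with the support of its subsets.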
- Course Projects:
- Cloud Computing and Big Data (Jan 2022 - May 2022)
- Class Description: Learned how to design serverless, asynchronous, scalable systems that can handle high loads using
cloud microservice architectures. We focused on AWS, but the principles apply to other cloud providers. The decoupled nature
of microservice system design allows systems to perform well at scale. Further, we experimented with cutting-edge resources
and learned how to use logging, asynchronous triggers, message queues, NoSQL databases, and big data tools such as
MapReduce and Spark. These tools let us use the full capabilities of the cloud to design asynchronous systems
that are decoupled and scalable, and that can perform big data operations on massive datasets.
- Surfworld Project: Built a sleekly designed web app for avid surfers that aggregates surf-specific data for the world's popular
surf breaks from third-party APIs. We present the data in an easy-to-digest way, allowing surfers to search for data near their
location (using geolocation) or for specific places. We leveraged AWS cloud services to build a fully decoupled
system that could scale to a large number of users.
- Course Projects (Private):
- Chatbot: Implemented a serverless, microservice-driven web application: a Dining Concierge
chatbot that sends restaurant suggestions based on preferences the user provides through conversation.
- Photo Album: Implemented a photo album web application that can be searched using natural language through both text
and voice. Used Lex, Elasticsearch, and Rekognition to create an intelligent search layer that queries
photos for people, objects, actions, landmarks, and more.
- Email Service: Implemented a machine learning model to predict whether a message is spam. Also built a
system that, upon receipt of an email message, automatically flags it as spam or not based on the model's prediction.
- Engineering Software as a Service (Sep 2021 - Dec 2021)
- Class Description: Crash course in modern software engineering and project management skills, such as
Test-Driven and Behavior-Driven Development using RSpec and Cucumber. We also learned
how to take ownership of the entire software development life cycle. Through
iterative management of a Minimum Viable Product (MVP), we worked closely with our mentors to ensure that we built
the "correct application," rather than merely building the "application correctly." Lastly, we
learned how to pick up new frameworks and methodologies on the fly, such as Ruby and Rails, in order to develop our SaaS web application.
- InGame Web Application:
Our application gives gamers a place to share and store in-game
stills or live clips that they have created using gallery-mode developer tools.
Gaming studios have developed advanced screen capture functionality and gallery modes that allow
the gamer to screenshot in-game moments. These features have evolved to the point where
gamers are a burgeoning content creation market.
By putting this power into the hands of the user, gamers can push the
limits of the game’s intended use to create "new media art" and share their in-game achievements.
As the market for this content grows, there will need to be a place to store, refine,
and share that content.
- Project Deliverables:
InGame Web Application (Hosted on Heroku),
InGame Live Demo,
InGame Demo Slides,
InGame Proposal Slides
- Source Code: InGame Rails App Code (Private)
- Additional Notes:
- Over 4 iterations, we applied key Agile Development concepts to produce, refactor, and refine our SaaS web application.
- Pitched our application to our classmates and professor through a proposal, demo, and launch, similar to an investor meeting.
- Have a live deployment of our app on Heroku. Gamers, feel free to join our community!
- Leveraged AWS as our third-party storage for all content uploaded by InGame users.
- Used PostgreSQL for our database schema and to establish relationships between our different internal models.
- Codebase written with Ruby on Rails, which allowed us to leverage the Model-View-Controller architecture of the Rails framework.
- Implemented a simple, elegant, minimalist interface so as not to degrade the user experience.
- C++ for C Programmers (May 2021 - Jun 2021)
- mystring.cpp, Makefile, test1.cpp, ..., test5.cpp:
Analyzed when the constructor, destructor, and
copy/move constructors and assignment operators are
called for objects on the stack by writing a C++ string class
(declared in a .h file, implemented in a .cpp file)
and various tester files.
- mdb-grep.cpp, mdb.cpp, mdb.h:
Focused on advanced C++ language features (multiple inheritance, virtual
functions, initializer lists, RAII) and STL classes (regex
pattern matching and C++ file streams) to binary write (read) database structures
to (from) files and search database files for specific patterns using regex objects.
- xdb.h, exception.h, helper.h, maker.cpp, maker.h,
mymake.cpp, MyMakefile:
Constructed a C++ utility to compile programs by implementing a subset of GNU
make’s functionality by populating a rules map and writing an algorithm to
recursively build target executables. Then, optimized the build process by
caching the rules map, via a generic template database structure,
to a serialized mymake.cache file. The package organizes its code
through a shared namespace, performs extensive error checking, and uses try/catch
blocks to check for and recover from user-defined exceptions so that the
build software does not crash.
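The recursive build over a rules map described above can be sketched roughly as follows. This is a Python illustration rather than the C++ from the assignment; the function name `build_order`, the sample rules, and the choice to treat names without a rule as existing source files are all assumptions of the sketch. It only computes the order in which targets would be built and does not check timestamps or run any commands.

```python
def build_order(rules, target, built=None, in_progress=None):
    """Return the order in which targets would be built, dependencies first.

    `rules` maps a target to the list of its prerequisites; names with no
    rule are treated as existing source files (an assumption of this sketch).
    """
    built = [] if built is None else built
    in_progress = set() if in_progress is None else in_progress
    if target in built or target not in rules:
        return built
    if target in in_progress:
        raise ValueError(f"circular dependency involving {target!r}")
    in_progress.add(target)
    for dep in rules[target]:
        # Recursively make sure every prerequisite is handled first.
        build_order(rules, dep, built, in_progress)
    in_progress.discard(target)
    built.append(target)
    return built

# Hypothetical rules map, loosely modeled on a small Makefile.
rules = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}
order = build_order(rules, "app")
print(order)
```

Raising an exception on a circular dependency, instead of recursing forever, mirrors the error-checking goal described above.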
- Analysis of Algorithms (Sep 2021 - Dec 2021)
- Algorithmic design for complex problems, and how to analyze algorithmic correctness and performance.
- Main Topics Covered: sorting algorithms, recursion, recurrence relations, divide and conquer, data structures,
greedy algorithms, dynamic programming, graphs and advanced graph/tree algorithms, network flows, linear programming,
randomized algorithms, NP-Completeness.
- Reductions from problems with no clear solution to ones in my toolkit that have efficient and elegant solutions.
- Although we placed an emphasis on pseudocode designs, I wrote the following Python solutions:
- sorting and data structures: QS_3_way_partition.py, bin_packing.py, BST_sorted_interval.py, bucket_sort.py,
counting_sort.py, merge_k_sorted_lists.py, smaller_than_k_minheap.py.
- greedy and dynamic programming: guards.py, max_independent_set.py, scheduling.py, weighted_jobs.py.
- graphs and network flows: articulation_point.py, bipartite.py, min_edge_dijkstra.py,
min_spanning_tree.py, SCC_dag.py, num_paths.py, profit_cost_ratio.py.
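To give a flavor of the first item in the list, here is a minimal sketch of quicksort with three-way partitioning. It is written for this page and is not the contents of QS_3_way_partition.py; the function name and the random pivot choice are assumptions.

```python
import random

def quicksort_3way(a, lo=0, hi=None):
    """Sort list `a` in place using three-way (Dutch national flag) partitioning."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[random.randint(lo, hi)]
    lt, i, gt = lo, lo, hi
    # Invariant: a[lo:lt] < pivot, a[lt:i] == pivot, a[gt+1:hi+1] > pivot.
    while i <= gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    # Elements equal to the pivot are already in place; recurse on the rest.
    quicksort_3way(a, lo, lt - 1)
    quicksort_3way(a, gt + 1, hi)

data = [5, 3, 5, 1, 5, 2, 3, 1]
quicksort_3way(data)
print(data)
```

Grouping keys equal to the pivot in the middle means inputs with many duplicates do not trigger repeated work on the same values.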
- Introduction to Databases (Jan 2021 - Apr 2021)
- Restaurant Database Design Group Project:
- Entity-Relationship Design: Performed E-R modeling for our restaurant database
and converted the abstract
schema to relational db model.
- Table Design and Data Creation: Defined SQL schema in PostgreSQL,
uploaded data, and tested the database
with advanced queries.
- Three Tier Architecture Web Interface:
Hosted the restaurant website content on Google Cloud Platform and developed the front-end interface. Used Python
and Flask to communicate with the PostgreSQL database, and Jinja templates to display query results.
The architecture served static and dynamic content to the client.
- Object-Oriented Features:
Added advanced SQL features, such as triggers, text, and arrays, to further refine our database
and better model a restaurant's digital needs.
- Homework Assignments:
- Performed various complex SQL queries for a common class database, and added security features such as
views, permissions, and roles.
- Studied different database models such as object-relational DBs, and analyzed indexing and physical storage
system costs as well as query processing/optimization.
- Analyzed advanced database theory (serializability, transactions, and normalization).
- Computer Networks (Sep 2021 - Dec 2021)
- Learned about the technical foundations of the internet in a "top down fashion" including applications, protocols,
algorithms (routing, transport/congestion control), and LANs.
- Hands-on experience with the network protocol stack: Application, Transport, Network, and Link layers.
- Application Layer Programming Assignment, Columbia ReopenCU imitator:
- Using Python Sockets API, wrote app layer program that modeled Columbia University's reopenCU application
for Covid-19 daily admittance to campus for non-symptomatic students.
- Wrote four programs: two client-server pairs, one using TCP and the other using UDP.
- Transport Layer Programming Assignment, TCP Reliable File Transfer over UDP Socket API:
- Using Python Sockets API, wrote an application layer program that provided reliable file transfers over
unreliable data channels.
- Implemented TCP-style reliability over the UDP Socket API and simulated unreliable, noisy transfer mediums using
an emulator that introduced packet loss, corruption, delay, and more. In short, I implemented TCP's
reliability mechanisms (packet timers, in-order delivery, checksums, and more) while using UDP sockets for transport.
- Developed my own packet wrapper API, which serialized and deserialized data to be sent over the internet
via the emulator proxy program.
- This project was a good simulation of how to implement reliability at the application layer on top of
noisy and unreliable data channels. Keeping this complexity in the higher layers of the protocol stack
at the end hosts relieves pressure on core network hardware
(routers and switches).
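As a sketch of the packet wrapper and checksum ideas described above, the Python below packs a sequence number, payload length, and 16-bit checksum in front of each payload and rejects packets whose checksum does not match. The header layout, the names `make_packet` and `parse_packet`, and the use of an RFC 1071 style one's-complement checksum are assumptions made for illustration, not the format used in the assignment. Timers, acknowledgments, and retransmission are left out.

```python
import struct

# Hypothetical header: sequence number (4 bytes), payload length (2), checksum (2).
HEADER = struct.Struct("!IHH")

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum, in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

def make_packet(seq: int, payload: bytes) -> bytes:
    """Serialize a packet, computing the checksum over header and payload."""
    header_no_sum = HEADER.pack(seq, len(payload), 0)
    checksum = internet_checksum(header_no_sum + payload)
    return HEADER.pack(seq, len(payload), checksum) + payload

def parse_packet(packet: bytes):
    """Return (seq, payload), or None if the packet is truncated or corrupted."""
    if len(packet) < HEADER.size:
        return None
    seq, length, checksum = HEADER.unpack(packet[:HEADER.size])
    payload = packet[HEADER.size:]
    if length != len(payload):
        return None
    expected = internet_checksum(HEADER.pack(seq, length, 0) + payload)
    return (seq, payload) if checksum == expected else None

packet = make_packet(7, b"hello")
print(parse_packet(packet))

# Simulate corruption in transit by flipping one bit of the payload.
damaged = bytearray(packet)
damaged[-1] ^= 0x01
print(parse_packet(bytes(damaged)))
```

A receiver built on this would drop any packet for which `parse_packet` returns `None` and rely on the sender's timer to trigger a retransmission.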
- Artificial Intelligence (Jan 2021 - Apr 2021)
- 2048 Puzzle Game: Implemented an adversarial search
agent to play using expectiminimax and alpha-beta pruning.
- Machine Learning: Implemented perceptron, linear
regression, and clustering algorithms for
data separation, prediction, and clustering of RGB values.
- N-Puzzle Board Game: Implemented and compared
breadth-first, depth-first, A-star search algorithms.
- Sudoku: Implemented backtracking search with the
minimum-remaining-values heuristic to reduce
variable domains.
- Python Review: Solved various coding problems, with
an emphasis on Python data structures (lists, priority
queues, graphs, sets, and more).
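The comparison of uninformed and informed search from the N-Puzzle item can be sketched on the 3x3 case as follows. This is an illustration written for this page, not the assignment code; the goal layout, the Manhattan-distance heuristic, and the function names are assumptions. Both functions return the number of moves to the goal along with the number of states they expanded.

```python
import heapq
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the blank tile

def neighbors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            board = list(state)
            swap = r * 3 + c
            board[blank], board[swap] = board[swap], board[blank]
            yield tuple(board)

def manhattan(state):
    """Sum of tile distances from their goal squares (an admissible heuristic)."""
    total = 0
    for pos, tile in enumerate(state):
        if tile:
            goal_pos = tile - 1
            total += abs(pos // 3 - goal_pos // 3) + abs(pos % 3 - goal_pos % 3)
    return total

def bfs(start):
    """Breadth-first search: returns (moves, states expanded)."""
    frontier, seen, expanded = deque([(start, 0)]), {start}, 0
    while frontier:
        state, depth = frontier.popleft()
        if state == GOAL:
            return depth, expanded
        expanded += 1
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None, expanded

def astar(start):
    """A* search ordered by cost so far plus Manhattan distance."""
    frontier = [(manhattan(start), 0, start)]
    best = {start: 0}
    expanded = 0
    while frontier:
        _, cost, state = heapq.heappop(frontier)
        if state == GOAL:
            return cost, expanded
        if cost > best[state]:
            continue  # stale queue entry; a cheaper path was found later
        expanded += 1
        for nxt in neighbors(state):
            if cost + 1 < best.get(nxt, float("inf")):
                best[nxt] = cost + 1
                heapq.heappush(frontier, (manhattan(nxt) + cost + 1, cost + 1, nxt))
    return None, expanded

start = (1, 2, 3, 4, 0, 6, 7, 5, 8)  # two moves away from GOAL
print("BFS:", bfs(start))
print("A*: ", astar(start))
```

Both searches find a shortest solution here; the expanded-state counts give a simple way to compare how much work each one does on harder boards.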
- Operating Systems I (Jan 2021 - Apr 2021)
- shell.c, entry.S, main.c, bare_hello.S, floppy.flp: Wrote a simple bash-like shell program to understand
the basic services provided by modern operating systems. Studied the bootstrap process (from BIOS loading the MBR to
GRUB bootloader) through writing x86 assembly code.
- prinfo.h, ptree.c, syscalls.h: Practiced building and running the Linux kernel. Traced the whole process
tree along with process descriptors (name, state, pid, parent pid) with a kernel system call.
Wrote user space C code to pull and print the process tree in breadth first order.
- pstrace.c, pstrace.h: Wrote system calls that trace process state changes
(TASK_RUNNING, ... , EXIT_ZOMBIE, EXIT_DEAD), and copy privileged data from kernel to user space.
- core.c, wrr.c: Hacked the Linux kernel scheduler by defining a multicore round-robin
scheduling class. Protected the integrity of each CPU core and data structures
through synchronization and load balancing.
- expose_pgtbl.c, expose_pgtbl.h: Mapped virtual memory addresses to physical memory addresses,
and investigated the Linux process address space through C code that triggered memory metadata changes.
- inode.c, file.c: Implemented the ppagefs pseudo file system (completely in
RAM, not backed up to disk).
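The breadth-first walk of the process tree mentioned in the ptree item can be sketched in user space like this. It is a Python illustration over a made-up process table, not the C code or the system call from the assignment; the `procs` layout (pid mapped to name and parent pid) and the function name are assumptions.

```python
from collections import defaultdict, deque

def bfs_process_tree(procs, root_pid=0):
    """Return (pid, name, depth) tuples in breadth-first order.

    `procs` maps pid -> (name, parent pid); the layout is hypothetical and only
    stands in for the per-process records a kernel would expose.
    """
    children = defaultdict(list)
    for pid, (_, ppid) in sorted(procs.items()):
        if pid != root_pid:
            children[ppid].append(pid)

    order, queue = [], deque([(root_pid, 0)])
    while queue:
        pid, depth = queue.popleft()
        order.append((pid, procs[pid][0], depth))
        # Children are visited only after every process at the current depth.
        queue.extend((child, depth + 1) for child in children[pid])
    return order

# Hypothetical snapshot of a small process table.
procs = {
    0: ("swapper", 0),
    1: ("systemd", 0),
    2: ("kthreadd", 0),
    120: ("sshd", 1),
    121: ("bash", 120),
    30: ("kworker", 2),
}
for pid, name, depth in bfs_process_tree(procs):
    print(f"depth={depth} pid={pid} name={name}")
```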
- Advanced Programming in C (Sep 2020 - Dec 2020)
- mylist.h, mylist.c: Created Generic Library.
- Wrote a generic singly linked list that could hold
any data type, along with associated generic methods.
- Used function pointers to take command line args
in argv and return them in reverse order.
- mdb.c: Designed our own data structure that could be binary
written (read) to (from) a file.
- mdb-lookup-server-nc.sh: Shell scripting, Linux process
management (fork/exec and all the various process states),
TCP/IP networking.
- Used netcat to connect to our database remotely to
perform various C I/O functions while using the
fork/exec paradigm.
- http-client.c, http-server.c: Leveraged the C sockets API and the HTTP protocol
to develop a webpage following a three-tier architecture.
- The HTTP client would receive data transferred
over the internet.
- The HTTP server could serve static content, defined in the website directories,
and/or interface with our previously defined database to service dynamic database queries
and render the results on a webpage.
- Data Structures in Java (May 2020 - Jul 2020)
- BigO-and-Generic: Implemented generic classes and interfaces,
linear search and recursive generic binary search algorithms,
and Big-O Analysis.
- Stacks-and-Queues: Created a symbol balance class that
iterates through Java source code ensuring that
syntax symbols match by utilizing stack data structures.
Also, implemented a generic stack class, and a two stack queue.
- Trees: Used recursive algorithms to build and traverse
nodes in expression trees (postfix, prefix, and infix).
For Binary Search Trees, I wrote algorithms that recursively
mirror the tree, traverse and find "the smallest value greater than x",
and "pretty print" the node values of the binary tree.
- Maps-and-heaps: Implemented a K-best algorithm and a
spell checking class.
- Dijkstra: Determined optimal routing from city x to city y, highlighted
in the Display.java GUI, using Dijkstra's greedy algorithm.
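The routing idea in the Dijkstra item can be sketched as below. The coursework was in Java; this is a Python illustration with a made-up road map, and the function name `shortest_route` and the adjacency-dictionary format are assumptions. It returns both the total distance and the sequence of cities, which is the information a GUI would need to highlight a route.

```python
import heapq

def shortest_route(graph, source, target):
    """Return (total distance, list of cities), or (inf, []) if unreachable."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, city = heapq.heappop(heap)
        if city in done:
            continue  # already settled with a shorter distance
        done.add(city)
        if city == target:
            break
        for nxt, weight in graph.get(city, {}).items():
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = city
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links backwards to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]

# Hypothetical road map; the distances are made up for illustration.
roads = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}
print(shortest_route(roads, "A", "D"))
```

The sketch assumes non-negative edge weights, which is the setting in which Dijkstra's greedy choice is valid.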
- Introduction to Computer Science & Programming in Java (Sep 2019 - Dec 2019)
- Drunkard's Walk and Leap Year Tester: Wrote a Java class to test
whether a certain year is a leap year. Another class simulates
the Drunkard's Walk model.
- Cryptography: Encryption and decryption of user-entered messages
through Substitution (Caesar Cipher), Additive (XOR Cipher), and
Transposition (Columnar) techniques.
- Video Poker: Developed a poker game where a user plays against the
CPU with varying payout multipliers according to the desired bet.
- Fail2Ban: Parsed error logs to
determine the IP addresses that most frequently failed to
log in to Columbia's network and wrote the results to an output log file.
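Two of the techniques named in the Cryptography item, the Caesar substitution and the XOR cipher, can be sketched in a few lines. The coursework was in Java; this Python version is only an illustration, and the function names and the choice to leave non-letters untouched are assumptions. Neither cipher is secure and both are shown purely as exercises.

```python
from itertools import cycle

def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, leaving other characters unchanged."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the repeating key; applying it twice restores the data."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

secret = caesar("Hello, World!", 3)
print(secret)              # Khoor, Zruog!
print(caesar(secret, -3))  # Hello, World!

scrambled = xor_cipher(b"attack at dawn", b"key")
print(xor_cipher(scrambled, b"key"))
```

Decryption reuses the same functions: a negative shift undoes the Caesar cipher, and XOR with the same key is its own inverse.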
- Computing in Context (Python) (Sep 2018 - Dec 2018)
- Bulls and Cows: Created a game where a user plays against the
CPU to guess a secret randomly generated number. The game ends when the
user gets 4 bulls, i.e., guesses the CPU's secret number.
- Scrabble Dictionary: Used Python I/O, dictionaries, and string methods to
parse a dictionary text file and assist the user with finding words of a certain length, words
that begin with a certain letter, and words that contain a certain letter.
- Analyzing NYC Parking Ticket Data: Analyzed NYC parking ticket data from NYC Open Data
to determine which year had the highest outstanding dollar balance, which license plate was the worst offender,
and which diplomatic consulate had the highest outstanding parking tickets balance.
- Computational Finance: Monte Carlo Simulation and option pricing.
- Schelling's Model of Segregation: Used NumPy to
simulate self-segregation according to a threshold t
of people of a similar race. Run the visualize.py code
from the command line to see the model.
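The scoring rule at the heart of the Bulls and Cows item can be sketched as follows. This is a small illustration rather than the original program; the function names and the assumption that the secret uses four distinct digits are mine. A bull is a correct digit in the correct position, and a cow is a correct digit in the wrong position.

```python
import random
from collections import Counter

def score(secret: str, guess: str):
    """Return (bulls, cows) for two digit strings of equal length."""
    bulls = sum(s == g for s, g in zip(secret, guess))
    # Digits shared regardless of position, counted with multiplicity.
    common = sum((Counter(secret) & Counter(guess)).values())
    return bulls, common - bulls

def new_secret(length: int = 4) -> str:
    """Pick a random secret with distinct digits (an assumption of this sketch)."""
    return "".join(random.sample("0123456789", length))

print(score("1234", "1243"))  # (2, 2)
print(score("1234", "1234"))  # (4, 0): the game ends
print(score(new_secret(), "0123"))
```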