Over the last 10 years or so we have seen a rapid evolution of the Internet experience. We went from linear, desktop-based, low-bandwidth, kind-of-ugly interfaces to the modern responsive, video-driven, asynchronous Web that we all love nowadays. This happened due to multiple factors, but most importantly because of the popularization of JavaScript as the de facto language of the web browser.

Cassandra to DynamoDB: Better performance and cost benefits

We recently moved from Apache Cassandra to Amazon DynamoDB. In this post I will discuss the architecture, the design decisions made in the process along with their justifications, and the steps taken to complete the switch.

Text extraction using Dragnet and Diffbot

Amazon Redshift Spectrum: Extending Our Data Warehouse Capabilities

Amazon Redshift is the primary data warehousing solution used at GumGum. Apart from the real-time reports, which are powered by Druid, Redshift fuels the majority of our reporting capabilities. Because it is a fully managed solution from Amazon, we do not have to maintain the Redshift cluster, be it the hardware or the Redshift engine. Redshift is column-oriented, and its massively parallel processing (MPP) architecture lets it scale to petabytes. Amazon Redshift also includes Redshift Spectrum, which can directly query data stored in S3 without loading it into Redshift.

GumGum is one of the enterprise-level customers of AWS. We run our entire infrastructure on AWS, including thousands of servers every day, so AWS is essential to GumGum’s success. That’s why GumGum sent 8 of its engineers and engineering leaders to re:Invent this year. From my personal experience, I think it was totally worth it.

Okta Implementation at GumGum

Deep Learning is a buzzword that gets thrown around a lot these days. It’s thought of as the “next big thing”, and it has already turned many heads and convinced many who initially thought of it as a bubble.

Detection is one of many classic computer vision problems that have significantly improved with the adoption of convolutional neural networks (CNNs). When CNNs rose to popularity for image classification, many relied on crude and expensive preprocessing routines for generating region proposals.

This is the first post in a series of three posts on GumSmash, an in-house, event-driven auto-remediation engine based on Ansible that we developed at GumGum.

GumGum has been nominated for Tech In Motion's 2017 "Timmy" Awards! We would love to have your vote. Please vote once per day throughout this month for Best Tech Startup and Best Tech Manager and be sure to share on social media.

This is the first post in a series of three posts on the Kafka Connect implementation at GumGum. In this post, I am going to explain why we chose to implement Kafka Connect and what impact it had on GumGum’s data processing architecture.

AWS’s Database Migration Service (DMS) is often misunderstood as a service that can only migrate your data to the cloud. But the service can also be very useful for replicating data between two datastores within the cloud.

Here at GumGum, we build and work on many React JS web applications. As the number of apps grew, we found ourselves writing the same code and components for each app. 

Enabling the general log on your MySQL RDS instance can be very useful, especially for auditing and accountability purposes. It is often useful for debugging problems too.