Serverless data pipelines at scale using AWS

Can you move your data pipelines to a serverless architecture? Should you? At GumGum we just built such a pipeline at scale using S3, Glue, Athena and Spectrum. Here are our tips and our feedback on the limitations of such an architecture.

As machine learning engineers, the CV and NLP teams at GumGum work on improving GumGum’s existing CV and NLP capabilities, developing solutions for new advertising campaigns, and maintaining code in a production environment.

Below is the recording of the presentation and the meetup discussion.

At GumGum, many of our applications must keep a record of the operations performed on the database (whether triggered by the API or by the application). Most commonly, this means keeping track of each user's interaction with the system, including metadata about the type, time, and trigger of the interaction. We sometimes refer to this type of logging as an audit trail and consider it a crosscutting concern that must be addressed while designing the system.
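As a rough illustration of what one audit trail entry might hold, here is a minimal Python sketch. It is not GumGum's implementation: the class names `AuditEvent` and `AuditTrail`, every field name, and the in-memory list standing in for a real store are all assumptions made for this example. It only shows the type, time, and trigger metadata described above attached to a user's interaction.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class AuditEvent:
    """One audit trail entry. All field names here are hypothetical."""
    user_id: str   # who performed the interaction
    action: str    # type of interaction, e.g. "UPDATE"
    trigger: str   # what initiated it, e.g. "api" or "application"
    target: str    # the record that was touched
    # Time of the interaction, stored as a timezone-aware UTC timestamp.
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    # Free-form extra metadata (assumed, not from the original post).
    details: dict[str, Any] = field(default_factory=dict)


class AuditTrail:
    """Append-only, in-memory stand-in for a real audit store."""

    def __init__(self) -> None:
        self._events: list[AuditEvent] = []

    def record(self, event: AuditEvent) -> None:
        # Events are only ever appended, never modified or removed.
        self._events.append(event)

    def for_user(self, user_id: str) -> list[AuditEvent]:
        # Return every recorded interaction for one user, in insertion order.
        return [e for e in self._events if e.user_id == user_id]
```

Making the event immutable (`frozen=True`) and the trail append-only reflects the usual expectation that audit records are not edited after the fact. A production system would persist them rather than hold them in memory.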

Markerless Augmented Advertising for Sports Video

For the past several years, GumGum has participated as an industry sponsor in the RIPS program, which is hosted by the Institute for Pure and Applied Mathematics at UCLA. The Research in Industrial Projects for Students (RIPS) program gives undergraduate students an opportunity to work on a real-world research project. This year, the proposed project was Markerless Augmented Advertising for Sports Video. A video of the full execution can be found here:

Japanese Named Entity Recognition

Ansible LA Meetup: Delivering Quality Automations with Ansible, Molecule and Drone

Presentation and meetup discussion by Florian Dambrine, Senior DevOps Engineer, on Test-Driven Infrastructure concepts and the use of Ansible Molecule to test Ansible roles on Docker using Drone pipelines.