Solvers for the Wordle game - Evaluation of strategies

Wordle is a web-based word game that became incredibly popular during the pandemic. It grew so popular that it was eventually bought by The New York Times for a significant sum and is now hosted there. The game is a lot of fun to solve manually, but I am also interested in solving it computationally. This is my attempt at coming up with a solution strategy for the game.
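At the core of any computational strategy is a scoring function that reproduces Wordle's green/yellow/gray feedback for a guess against a hidden answer, including the tricky duplicate-letter rules. A minimal sketch (the function name and the `g`/`y`/`.` encoding are my own, not from the post):

```python
from collections import Counter

def feedback(guess: str, answer: str) -> str:
    """Return Wordle-style feedback: 'g' = green (right letter, right spot),
    'y' = yellow (letter present elsewhere), '.' = gray (absent)."""
    result = ["."] * len(guess)
    # Letters of the answer that are not exact matches; these are the
    # only ones available to award yellows, one per occurrence.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    # First pass: mark greens.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
    # Second pass: mark yellows, consuming the remaining letter counts
    # so duplicate guess letters are not over-credited.
    for i, g in enumerate(guess):
        if result[i] == "." and remaining[g] > 0:
            result[i] = "y"
            remaining[g] -= 1
    return "".join(result)
```

For example, `feedback("crane", "cigar")` yields `"gyy.."`: the `c` is green, while `r` and `a` appear elsewhere in the answer.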

Using Hugging Face Transformers on AWS Sagemaker

In July 2021, AWS and Hugging Face announced a collaboration to make Hugging Face a first-party framework within SageMaker. Previously, you had to use a PyTorch container and install the packages manually. With the new Hugging Face Deep Learning Containers (DLCs) available in Amazon SageMaker, training and deploying models is greatly simplified.

In this post, we will go through a high-level overview of the Hugging Face Transformers library before looking at how to use the newly announced Hugging Face DLCs within SageMaker.

Previewing command line JSON output using firefox

Firefox, like other modern browsers, has an excellent built-in JSON viewer. It also supports data URIs, which let you load a resource from text embedded in the URL as if it were an external resource. We can combine these two features into a handy JSON previewer that can be invoked from the command line.

For example, when you enter the link below into your browser, it opens a plain-text document reading “Hello, World!”.

data:,Hello%2C%20World!

This content is not limited to plain text. It can even be an HTML document:

data:text/html,%3Ch1%3EHello%2C%20World!%3C%2Fh1%3E
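The same mechanism works for JSON: URL-encode the document, prefix it with a `data:application/json,` header, and hand the result to Firefox, which renders it in its JSON viewer. A minimal sketch in Python (the helper names are mine, and the last function assumes a `firefox` binary is on your PATH):

```python
import json
import subprocess
import urllib.parse

def json_data_uri(obj) -> str:
    """Encode a Python object as a data: URI holding pretty-printed JSON."""
    text = json.dumps(obj, indent=2)
    return "data:application/json," + urllib.parse.quote(text)

def preview_in_firefox(obj) -> None:
    """Open the encoded JSON in Firefox's built-in JSON viewer."""
    subprocess.Popen(["firefox", json_data_uri(obj)])
```

A command-line wrapper around `preview_in_firefox` that reads JSON from stdin then gives you a one-shot previewer for the output of tools like `curl`.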

Writing BDD tests for Terraform Code Using Terratest

Terratest is a popular library for testing Terraform code. Testing Infrastructure as Code (IaC) is not as widespread as it should be. The reasons are manifold, ranging from developers' attitudes towards testing to the difficulty of writing unit tests given the inherent side effects of IaC. Nevertheless, testing is no less important, particularly in these scenarios:

  1. When your module gets complicated, with medium to complex behaviour logic
  2. When your module makes underlying assumptions of external dependencies (such as AWS SCPs at Organization level permitting certain actions)

In this post, we will take a look at using Terratest to test Terraform code. A typical Terratest testing pattern involves:

  1. Deploying real infrastructure in a real environment
  2. Asserting that the deployed resources behave as expected
  3. Undeploying everything at the end of the test

Behavior-Driven Development (BDD) uses examples to describe the behavior of a system, serving the dual purpose of testing the code and documenting it at the same time. Terratest is not a BDD testing framework, but it is possible to write BDD tests that execute Terratest code. In a later section of this post, we will see how this can be achieved using Godog, a BDD testing library for Go.
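To make the "examples as documentation" idea concrete, a Godog feature file for a hypothetical Terraform S3 bucket module might look like the following (the module, bucket name, and step wording are illustrative, not taken from a real test suite); each `Given`/`Then` step would map to a Go step definition that calls Terratest under the hood:

```gherkin
Feature: S3 bucket Terraform module
  As a platform engineer
  I want buckets to be versioned and encrypted by default
  So that data cannot be silently lost or exposed

  Scenario: Bucket is created with safe defaults
    Given the module is applied with bucket name "example-terratest-bucket"
    Then the bucket should exist
    And versioning should be enabled
    And server-side encryption should be enabled
```

The scenario doubles as living documentation: a reader can learn the module's guarantees without opening the Go test code.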

GPT-3 and prospects of Artificial General Intelligence

Last year, OpenAI released the Generative Pre-trained Transformer 2 (GPT-2) model. GPT-2 was a language model with 1.5 billion parameters, trained on 8 million web pages. It generated quite a buzz, as it could produce coherent text, comprehend paragraphs, answer questions, summarize text, and do all sorts of smart stuff... all without any task-specific training. OpenAI initially deemed the model too dangerous to release, but eventually released it in full.

In May 2020, OpenAI released its follow-up GPT-3 model, which took the game several notches higher. GPT-3 has 175 billion parameters and was trained on close to half a trillion tokens. The model weights alone would take up around 300 GB of VRAM. This is a drastic increase in scale and complexity, any way you look at it. So what can a model this huge achieve, and why has it reinvigorated talk of Artificial General Intelligence?

GPT-3 Training Size