This post outlines the basics of creating a project that combines a native Go backend (or "golang", as googling for "go" is a pain; why didn't the guys at Google think of this?) with a web UI / GUI built with SvelteKit.
In a nutshell, this involves creating a new Go project, writing a simple web server program that serves files from a static folder, and finally creating a SvelteKit project and configuring it to produce static content into
that folder. But let's take a short detour on why this might be useful!
Combining a native executable with a web UI
Native graphical user interfaces are not easy on any platform, and after
looking at Qt, wxWidgets, Electron etc. I decided all had either major
shortcomings, huge learning curves or resulted in way too large packages.
Writing a native web server, on the other hand, is quite easy using Go. I
also investigated C and C++, but at least on Windows you very quickly run
into MinGW vs. Visual Studio issues, runtimes, build systems and all that chaos,
whereas Go pretty much produces executables with minimum fuss.
Once you have a web server, you can just serve a web UI, and the user
can run the executable and open the UI in their browser.
Simple web server with Go
Once you are comfortable creating a "Hello world" level app in Go, making a
simple web server app is not too hard:
$ mkdir project
$ cd project
project$ go mod init example/project
Here's a simple web server you can paste into main.go:
package main

import (
	"encoding/json"
	"log"
	"mime"
	"net/http"
)

func databases(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("Access-Control-Allow-Origin", "*") // for CORS
	w.WriteHeader(http.StatusOK)
	test := []string{}
	test = append(test, "Hello")
	test = append(test, "World")
	json.NewEncoder(w).Encode(test)
}

func main() {
	// Windows may be missing this
	mime.AddExtensionType(".js", "application/javascript")
	http.Handle("/test", http.HandlerFunc(databases))
	http.Handle("/", http.FileServer(http.Dir("static")))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
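A quick way to check a handler like this without opening a browser is Go's net/http/httptest package. The sketch below is my own illustration, not part of the original program: it repeats the handler under a hypothetical name, helloHandler, calls it in-process and prints the status code and the decoded JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// helloHandler is a hypothetical copy of the databases handler above:
// it answers with a JSON array of two strings.
func helloHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("Access-Control-Allow-Origin", "*") // for CORS
	json.NewEncoder(w).Encode([]string{"Hello", "World"})
}

// fetchTest calls the handler without starting a real server and
// returns the status code and the decoded response body.
func fetchTest() (int, []string, error) {
	req := httptest.NewRequest(http.MethodGet, "/test", nil)
	rec := httptest.NewRecorder()
	helloHandler(rec, req)

	var words []string
	err := json.NewDecoder(rec.Body).Decode(&words)
	return rec.Code, words, err
}

func main() {
	code, words, err := fetchTest()
	if err != nil {
		panic(err)
	}
	fmt.Println(code, words) // 200 [Hello World]
}
```

The same pattern should work for any handler registered with http.Handle, which makes it handy before the SvelteKit side exists.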
OpenAI came out with ChatGPT, and wow, that is quite something. What is also remarkable is the
load the ChatGPT client is under, and how often it is "experiencing high demand".
Or just requires you to prove you are human and log in again.
You can get ChatGPT Plus for $20 a month, but hey, you can also get a chat experience through the API for $0.002 per 1000 tokens. To hit that monthly fee, you need to use 10 M tokens, which is roughly 7.5 M English words if you go by OpenAI's rule of thumb of about three quarters of a word per token. That is pretty heavy use...
Using OpenAI ChatGPT (gpt-3.5-turbo) through Python API
To use the ChatGPT API in its simplest form with Python 3, you just pip install openai and create a short script (the calls used here are from the pre-1.0 interface of the openai package):
#!/usr/bin/python3
import openai
import sys

openai.api_key = 'sk-yourkeyhere'

if len(sys.argv) < 2:
    prompt = input('Prompt: ')
else:
    prompt = ' '.join(sys.argv[1:])

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a programming expert giving advice to a colleague."},
        {"role": "user", "content": prompt}
    ]
)

print(resp['choices'][0]['message']['content'])
print('Usage was', resp['usage'])
You need to create credentials at the OpenAI platform, enter your credit card and set a warning and a hard threshold for monthly billing (I set mine to $4 and $10, respectively). But after filling in your API key in the script, you can just run it:
$ python3 chat.py What is the capital of Alaska
The capital of Alaska is Juneau. However, I believe you were looking for programming advice. What specifically are you working on and what kind of advice are you seeking?
Usage was {
  "completion_tokens": 34,
  "prompt_tokens": 30,
  "total_tokens": 64
}
Now that is pretty nice, but we can do better!
Golang client with SvelteKit frontend
In my previous Golang+SvelteKit GUI post I explored how to create a Go application that acts as a web server and serves a user interface made with SvelteKit:
Golang has high performance and an excellent set of libraries to accomplish many tasks
Cross-platform support out of the box with compact executables
SvelteKit is fast to develop with as a frontend, requiring very little code for rich interactive UIs
OpenAI does not produce its own Go library, but the API is well documented and sashabaranov has made an excellent Golang OpenAI API library that makes calling the API simple. It even supports GPT-4, so if you have access to that, you can create a GPT-4 chat client as well.
Without further ado, here's the GitHub repository for my GoChatGPT client. You can basically git clone https://github.com/jokkebk/gochatgpt and follow the instructions in README.md to get it running; it's about 4 commands all in all.
Let's look a bit at what the code does!
Golang ChatGPT Backend
The main function of the backend is nothing too complex:
Serve the SvelteKit GUI from the static folder (including index.html when the user requests /).
Have a chat endpoint at /chat that takes a JSON array of chat messages and passes it to the OpenAI API.
Return the OpenAI (ChatGPT) response as a string to the calling client.
func main() {
	// Get the API key from the environment variable OPENAI_API_KEY
	apiKey := os.Getenv("OPENAI_API_KEY")
	client := openai.NewClient(apiKey)

	// Serve the SvelteKit build output from the static folder
	http.Handle("/", http.FileServer(http.Dir("static")))

	http.HandleFunc("/chat", func(w http.ResponseWriter, r *http.Request) {
		// Decode the JSON array of chat messages sent by the frontend
		var messages []openai.ChatCompletionMessage
		err := json.NewDecoder(r.Body).Decode(&messages)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		ans, shouldReturn := chatGPT(client, messages)
		if shouldReturn {
			http.Error(w, "ChatGPT error", http.StatusBadRequest)
			return
		}
		// Write the answer back to the client as plain text
		w.Header().Set("Content-Type", "text/plain")
		w.Write([]byte(ans))
	})

	address := "localhost:1234"
	log.Printf("Starting server, go to http://%s/ to try it out!", address)
	log.Fatal(http.ListenAndServe(address, nil))
}
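Since the /chat handler decodes the request body straight into a slice of chat messages, the frontend has to POST a JSON array of objects with role and content fields. The sketch below shows that decoding step in isolation. To keep it runnable without the external library it uses a hypothetical local struct, chatMessage, as a stand-in for openai.ChatCompletionMessage, with only those two fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// chatMessage is a hypothetical stand-in for openai.ChatCompletionMessage,
// keeping only the role and content fields.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// decodeMessages parses a request body the same way the /chat handler does.
func decodeMessages(body string) ([]chatMessage, error) {
	var messages []chatMessage
	err := json.NewDecoder(strings.NewReader(body)).Decode(&messages)
	return messages, err
}

func main() {
	// Example payload a frontend could send; the texts are made up.
	body := `[
		{"role": "system", "content": "You are a programming expert."},
		{"role": "user", "content": "What is a goroutine?"}
	]`
	messages, err := decodeMessages(body)
	if err != nil {
		panic(err)
	}
	for _, m := range messages {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

A body that is not a JSON array makes Decode return an error, which is the case the handler answers with http.StatusBadRequest.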
With the rather straightforward OpenAI library, the chatGPT() function is nothing to write home about either: we get the chat messages, put them into a ChatCompletion object, pass
it to the API and (hopefully) return the top answer (or an empty answer and an error flag if it failed). The full function is in the repository.
Just a small note / Gist type of thing for today: I got tired of adding w.Header().Set("Access-Control-Allow-Origin", "*") to every handler function in my small Golang
web app. I'm using Julien Schmidt's excellent httprouter module for simple routing. Turns
out the Basic Authentication example is quite simple to adjust for a
set-and-forget type of httprouter.Handle middleware:
// https://github.com/julienschmidt/httprouter middleware to set CORS header
func MiddleCORS(next httprouter.Handle) httprouter.Handle {
	return func(w http.ResponseWriter,
		r *http.Request, ps httprouter.Params) {
		w.Header().Set("Access-Control-Allow-Origin", "*")
		next(w, r, ps)
	}
}
Using the middleware is simple, just wrap your normal handler function:
router.GET("/someurl", MiddleCORS(SomeURLFunc))
As both the middleware and the function it takes implement httprouter.Handle, you can just chain multiple middleware with MiddleCORS(AnotherMiddleware(SomeURLFunc)).
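The same wrap-and-chain idea works with plain net/http types. The sketch below is a standard-library variant (it swaps httprouter.Handle for http.HandlerFunc so it runs without the external module). The names withCORS, withJSON and hello are hypothetical, and httptest is only used to show the headers that end up on the response.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withCORS sets the CORS header and then calls the wrapped handler.
func withCORS(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Access-Control-Allow-Origin", "*")
		next(w, r)
	}
}

// withJSON is a second, hypothetical middleware used to show chaining.
func withJSON(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		next(w, r)
	}
}

// hello is the innermost handler and only writes the body.
func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, `["Hello","World"]`)
}

// callChained runs the chained handler in-process and returns the
// headers of the recorded response.
func callChained() http.Header {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/someurl", nil)
	withCORS(withJSON(hello))(rec, req)
	return rec.Result().Header
}

func main() {
	h := callChained()
	fmt.Println(h.Get("Access-Control-Allow-Origin"), h.Get("Content-Type"))
}
```

Because each wrapper takes and returns the same function type, the order of nesting is the order in which the headers get set.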
Hugging Face's transformers library is a great resource for natural language processing tasks, and it includes an implementation of OpenAI's CLIP model, including the pretrained model clip-vit-large-patch14. The CLIP model is a powerful image and text embedding model that can be used for a wide range of tasks, such as image captioning and similarity search.
The CLIPModel documentation provides examples of how to use the model to calculate the similarity of images and captions, but it is less clear on how to obtain the raw embeddings of the input data. While the documentation provides some guidance on how to use the model's embedding layer, it is not always clear how to extract the embeddings for further analysis or use in other tasks.
Furthermore, the documentation does not cover how to calculate similarity between text and image embeddings yourself. This can be useful for tasks such as image-text matching or precalculating image embeddings for later (or repeated) use.
In this post, we will show how to obtain the raw embeddings from the CLIPModel and how to calculate similarity between them using PyTorch. With this information, you will be able to use the CLIPModel in a more flexible way and adapt it to your specific needs.
Benchmark example: Logit similarity score between text and image embeddings
Here's the example from CLIPModel documentation we'd ideally like to split into text and image embeddings and then calculate the similarity score between them ourselves:
from PIL import Image
import requests
from transformers import AutoProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = AutoProcessor.from_pretrained("openai/clip-vit-large-patch14")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(
    text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True
)
outputs = model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
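If you run the code and print(logits_per_image), you get one similarity score for each of the two captions. The text and image embeddings can also be fetched separately with get_text_features and get_image_features: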
from PIL import Image
import requests
from transformers import AutoProcessor, AutoTokenizer, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
# Get the text features
tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-large-patch14")
inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
text_features = model.get_text_features(**inputs)
print(text_features.shape) # output shape of text features
# Get the image features
processor = AutoProcessor.from_pretrained("openai/clip-vit-large-patch14")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")
image_features = model.get_image_features(**inputs)
print(image_features.shape) # output shape of image features
Looks pretty good! Two 768-item tensors for the two labels, and one similarly sized for the image! Now let's see if we can calculate the similarity between the two...
Today, as I was finishing my Go+SvelteKit article, I ran into a frustrating
Chrome error message:
Failed to load module script: Expected a JavaScript module script but the
server responded with a MIME type of "text/plain". Strict MIME type checking
is enforced for module scripts per HTML spec.
Don't you just love Chrome? It knows what it needs to do (load a JavaScript
module), but utterly refuses to do that because of a wrong MIME type. This
happened with a client-side SvelteKit application, when it tried to open some
part of the .js code.
At the time of writing I could not easily find an answer to this
one, although there does seem to be a Stack Overflow solution discussing
it. To help others hitting the same issue:
The problem on my Windows install was likely that the Windows 10 registry did not
contain a MIME type definition for .js files. Telling users how to tweak the
registry to get your program working is not ideal, but thankfully you can
augment the MIME types in code:
package main
import (
	"log"
	"mime"
	"net/http"
)

func main() {
	// Windows may be missing this
	mime.AddExtensionType(".js", "application/javascript")
	// And then you create the FileServer like you normally would
	http.Handle("/", http.FileServer(http.Dir("static")))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
After adding the MIME fix, remember to force-reload the page in Chrome (hold the Control key down while you press refresh). Otherwise the problem persists, as Chrome does not
really bother reloading the offending files.
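To confirm the mapping took effect without involving the browser at all, you can ask the mime package what it now reports for .js. As far as I know, http.FileServer uses the same lookup to pick the Content-Type from the file extension. A minimal sketch, with registerJS as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"mime"
)

// registerJS maps .js to a JavaScript MIME type and returns the type
// Go now reports for that extension.
func registerJS() (string, error) {
	if err := mime.AddExtensionType(".js", "application/javascript"); err != nil {
		return "", err
	}
	return mime.TypeByExtension(".js"), nil
}

func main() {
	typ, err := registerJS()
	if err != nil {
		panic(err)
	}
	fmt.Println(typ)
}
```

If this prints a JavaScript type but Chrome still complains, the cached response is the likely culprit, which is where the forced reload comes in.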
I've had an iPad Pro 12.9" for some time, and have been happily using it for sheet
music when playing the piano. However, having to interrupt your playing to
swipe to the next page does get annoying. You can get a $100 commercial
AirTurn pedal, but since one can get a microcontroller from eBay/AliExpress for $4
and a simple foot pedal switch for $2, I thought it would be a fun one-evening
hacking project. It turned out quite nice.
Getting started: ESP32 devkit
The sheet music applications on iPad (and Android) usually have
Bluetooth keyboard support, turning the page when the user presses an
arrow key or the spacebar. So the minimum viable product is just a
Bluetooth-enabled microcontroller that can pair with the iPad and
send a single key upon request.
The ESP32 chip has both WiFi and Bluetooth, and I chose it for this project, as
it is readily available in a compact form factor, and it's easy to program with
Arduino. Searching AliExpress (https://www.aliexpress.com) for ESP32 should
give you plenty of options.
I had an ESP32 board labelled "ESP32 DEVKITV1" in my parts box, and it was
quite easy to set up with this ESP32
tutorial:
1. Install the driver for the USB-UART bridge
2. Add source URLs for ESP32 to Arduino and install ESP32 support
3. Select "DOIT ESP32 DEVKIT" from the board menu
4. Hold down the "Boot" button on the board while selecting "Upload", and release it when the console says "Connecting..."
Before you proceed with the tutorial, check that you can get the
lights blinking or flash some other example code successfully on
the board. There are plenty of resources around if you run into
any issues! I had to google step 4 myself, although it would
have sufficed to read the linked tutorial carefully...