Notice: I wanted to see if OpenAI Canvas can do reasonable Markdown editing, so this post is co-written with ChatGPT 4o with Canvas. The code and Fish script were written separately beforehand, also with the gracious help of our AI overlords. I've kept the prose to a minimum and edited the result myself, so the benefit should still be high even though the amount of manually written content is low.
Recently, I wanted to make my command-line experience a bit more conversational. Imagine writing a comment like # list files, pressing enter, and seeing it magically turn into the corresponding Fish shell command: ls. With OpenAI's API, this becomes not just possible but surprisingly straightforward. It should also save me from jumping to ChatGPT every time I need to remember how find, let alone ffmpeg, exactly works.
This blog post walks through creating a Python script called shai that turns natural language comments into Unix commands using OpenAI's API, and then using that script with a Fish shell function to replace a comment written on the command line with the actual command. After the command is generated, you can edit it before running it, which is a nice way to integrate AI without losing control.
Setting up the environment
Before we dive into the script, make sure you have the following:
Python installed (version 3.8 or higher is recommended).
An OpenAI API key. If you don’t have one, sign up at OpenAI.
The OpenAI Python library in a Python virtual environment (or adjust the code below if you prefer something else like pip install openai on your global env):
$ python3 -m venv /home/joonas/venvs/myenv
$ source /home/joonas/venvs/myenv/bin/activate.fish # or just activate with bash
$ pip install openai
A configuration file named openai.ini with your API key and model settings, structured like this:
[shai]
api_key = your-openai-api-key
model = gpt-4o-mini
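A missing section or key in openai.ini only shows up as a KeyError once the script runs. As a minimal sketch (the helper name check_shai_config is hypothetical and not part of the shai script), the file contents could be validated up front like this:

```python
import configparser

REQUIRED_KEYS = ("api_key", "model")  # the two settings the shai script reads

def check_shai_config(text):
    """Return the [shai] settings as a dict, or raise ValueError naming what is missing."""
    config = configparser.ConfigParser()
    config.read_string(text)
    if not config.has_section("shai"):
        raise ValueError("missing [shai] section")
    missing = [key for key in REQUIRED_KEYS if not config["shai"].get(key)]
    if missing:
        raise ValueError("missing keys in [shai]: " + ", ".join(missing))
    return dict(config["shai"])

# Same structure as the openai.ini example above
sample = """
[shai]
api_key = your-openai-api-key
model = gpt-4o-mini
"""
print(check_shai_config(sample)["model"])  # prints gpt-4o-mini
```

For a real file, the text would come from reading openai.ini instead of the sample string.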
The Python script
Here’s the Python script, shai, that interprets natural language descriptions and returns Unix commands:
#!/home/joonas/venvs/myenv/bin/python
import os
import sys
from openai import OpenAI
import configparser
# Read the configuration file
config = configparser.ConfigParser()
config.read('/home/joonas/openai.ini')
# Initialize the OpenAI client with your API key
client = OpenAI(api_key=config['shai']['api_key'])
def get_unix_command(natural_language_description):
    # Define the system prompt for the model
    system_prompt = (
        "You are an assistant that converts natural language descriptions of tasks into "
        "concise, accurate Unix commands. Always output only the Unix command without any "
        "additional explanations or text. Your response must be a single Unix command."
    )
    # Call the OpenAI API with the description
    response = client.chat.completions.create(
        model=config['shai']['model'],
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": natural_language_description},
        ],
        temperature=0, # To ensure consistent and accurate output
    )
    # Extract the command from the response
    command = response.choices[0].message.content.strip()
    return command

def main():
    if len(sys.argv) < 2:
        print("Usage: shai <natural language description>")
        sys.exit(1)
    # Get the natural language description from command line arguments
    description = " ".join(sys.argv[1:])
    try:
        # Generate the Unix command
        unix_command = get_unix_command(description)
        print(unix_command)
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()
How it works
Configuration: The script reads an openai.ini file for API credentials and model settings.
Command generation: When you provide a natural language description, the script sends it to OpenAI’s API along with a system prompt specifying the desired output format.
Output: The script returns the corresponding Unix command.
You could place it in e.g. ~/bin and do chmod +x shai to make it runnable, and then test it:
$ shai list files
ls
Extending to Fish shell
To make this functionality seamlessly available in the Fish shell, you can use the following Fish function:
function transform_comment_line
    set cmd (commandline)
    # Check if line starts with a hash (a comment)
    if string match -q "#*" $cmd
        # Remove the '#' and possible leading space
        set query (string trim (string sub -s 2 $cmd))
        # Run your "shai" script (replace 'shai' with the actual command)
        # Assuming that 'shai' takes the query as arguments and prints the command
        set result (shai $query)
        # Replace the current command line with the output of 'shai'
        commandline -r $result
        # Now your command line is replaced with the generated command.
        # The user can edit it further if needed, and press Enter again to run.
    else
        # If it's not a comment line, just execute normally
        commandline -f execute
    end
end
Save this function in your Fish configuration directory as .config/fish/functions/transform_comment_line.fish. Then, bind it to a key or trigger it manually to convert comments into executable commands. I am using this in my .config/fish/config.fish to automatically run on enter:
if status is-interactive
# Commands to run in interactive sessions can go here
bind \r transform_comment_line
end
And that is literally it. Enjoy!
Ending was edited for brevity, ChatGPT wanted to rant on how this could become a powerful part of your workflow...
After a bit of an AI hiatus, I noticed that the llama 3.0 models were released and wanted to try them. Sure enough, after a week the weights were available at the official site. However, I hadn't used my Docker image in a while, and I wanted to upgrade it without losing the models.
There was almost no information on this available online yet, and even the
ollama docker documentation is quite non-existent — maybe for seasoned
Docker users it is obvious what needs to be done? But not for me, so let's see
if I can manage it.
Upgrading the docker image
First, let's just upgrade the ollama/ollama image:
$ sudo docker pull ollama/ollama
This is nice, but the currently running container is still the old one. Let's stop it:
$ sudo docker stop ollama
Checking the location of the files
I remember I set a custom directory to store the models. Let's check where it is:
As can be seen, the models are stored in /mnt/scratch/docker/volumes/ollama/_data. Let's make a hard-linked copy
of the files into another folder, to make sure we don't lose them:
Having just spent 4 hours trying to get a Python pseudocode version of PBKDF2 to match the hashlib.pbkdf2_hmac() output, I thought I'd post Yet Another Example of how to do it. I thought I could just use hashlib.sha256 to calculate the steps, but it turns out HMAC is not just a concatenation of password, salt and counter.
So, without further ado, here's a 256 bit key generation with password and salt:
import hashlib, hmac
def pbkdf2(pwd, salt, iter):
    h = hmac.new(pwd, digestmod=hashlib.sha256) # create HMAC using SHA256
    m = h.copy() # calculate PRF(Password, Salt+INT_32_BE(1))
    m.update(salt)
    m.update(b'\x00\x00\x00\x01')
    U = m.digest()
    T = bytes(U) # copy
    for _ in range(1, iter):
        m = h.copy() # new instance of hmac(key)
        m.update(U) # PRF(Password, U-1)
        U = m.digest()
        T = bytes(a^b for a,b in zip(U,T))
    return T
pwd = b'password'
salt = b'salt'
# both should print 120fb6cffcf8b32c43e7225256c4f837a86548c92ccc35480805987cb70be17b
print(pbkdf2(pwd, salt, 1).hex())
print(hashlib.pbkdf2_hmac('sha256', pwd, salt, 1).hex())
# both should print c5e478d59288c841aa530db6845c4c8d962893a001ce4e11a4963873aa98134a
print(pbkdf2(pwd, salt, 4096).hex())
print(hashlib.pbkdf2_hmac('sha256', pwd, salt, 4096).hex())
Getting from pseudocode to an actual working example was surprisingly hard, especially since most implementations on the web are in lower-level languages, and the Python results mostly just use a library.
Simplifying the pseudo code further
If you want to avoid the new...update...digest dance and skip the hmac library altogether, the code becomes even simpler. HMAC is quite simple to implement in Python. Here's a gethmac function hard-coded to SHA256 and an even shorter pbkdf2:
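A minimal sketch of what that can look like, following the HMAC construction from RFC 2104 with SHA256's 64-byte block size. The body of gethmac and the parameter names are my own assumptions, not a canonical listing:

```python
import hashlib

def gethmac(key, content):
    """HMAC-SHA256: H((key ^ opad) + H((key ^ ipad) + content))."""
    block_size = 64  # SHA256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(block_size, b'\x00')    # short keys are zero-padded
    okeypad = bytes(k ^ 0x5c for k in key)
    ikeypad = bytes(k ^ 0x36 for k in key)
    inner = hashlib.sha256(ikeypad + content).digest()
    return hashlib.sha256(okeypad + inner).digest()

def pbkdf2(pwd, salt, iterations):
    """One 32-byte block of PBKDF2-HMAC-SHA256 (block index fixed to 1)."""
    U = gethmac(pwd, salt + b'\x00\x00\x00\x01')
    T = U
    for _ in range(1, iterations):
        U = gethmac(pwd, U)
        T = bytes(a ^ b for a, b in zip(U, T))
    return T

# Should agree with the library implementation
print(pbkdf2(b'password', b'salt', 4096) ==
      hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 4096))
```

This recomputes the key pads on every call, so it is slower than copying a prepared hmac object, but it shows exactly which bytes end up being hashed.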
Hugging Face's transformers library is a great resource for natural language processing tasks, and it includes an implementation of OpenAI's CLIP model, including a pretrained model clip-vit-large-patch14. The CLIP model is a powerful image and text embedding model that can be used for a wide range of tasks, such as image captioning and similarity search.
The CLIPModel documentation provides examples of how to use the model to calculate the similarity of images and captions, but it is less clear on how to obtain the raw embeddings of the input data. While the documentation provides some guidance on how to use the model's embedding layer, it is not always clear how to extract the embeddings for further analysis or use in other tasks.
Furthermore, the documentation does not cover how to calculate similarity between text and image embeddings yourself. This can be useful for tasks such as image-text matching or precalculating image embeddings for later (or repeated) use.
In this post, we will show how to obtain the raw embeddings from the CLIPModel and how to calculate similarity between them using PyTorch. With this information, you will be able to use the CLIPModel in a more flexible way and adapt it to your specific needs.
Benchmark example: Logit similarity score between text and image embeddings
Here's the example from CLIPModel documentation we'd ideally like to split into text and image embeddings and then calculate the similarity score between them ourselves:
from PIL import Image
import requests
from transformers import AutoProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = AutoProcessor.from_pretrained("openai/clip-vit-large-patch14")  # same checkpoint as the model
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(
text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True
)
outputs = model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
If you run the code and print(logits_per_image) you should get:
from PIL import Image
import requests
from transformers import AutoProcessor, AutoTokenizer, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
# Get the text features
tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-large-patch14")
inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
text_features = model.get_text_features(**inputs)
print(text_features.shape) # output shape of text features
# Get the image features
processor = AutoProcessor.from_pretrained("openai/clip-vit-large-patch14")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")
image_features = model.get_image_features(**inputs)
print(image_features.shape) # output shape of image features
Looks pretty good! Two 768 item tensors for the two labels, and one similarly sized for the image! Now let's see if we can calculate the similarity between the two...
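As a minimal sketch of that calculation, assuming the features are plain tensors shaped as printed above: normalize both sides to unit length, take the dot products, and multiply by the model's learned scale. The function name clip_logits_per_image is hypothetical, and random stand-in tensors are used here so the snippet runs without downloading the model.

```python
import torch

def clip_logits_per_image(image_features, text_features, logit_scale):
    """Scaled cosine similarity: one row per image, one column per text."""
    # L2-normalize so the dot product becomes cosine similarity
    image_embeds = image_features / image_features.norm(p=2, dim=-1, keepdim=True)
    text_embeds = text_features / text_features.norm(p=2, dim=-1, keepdim=True)
    return logit_scale * image_embeds @ text_embeds.t()

# Stand-ins with the same shapes as text_features and image_features above
text_features = torch.randn(2, 768)
image_features = torch.randn(1, 768)

logits_per_image = clip_logits_per_image(image_features, text_features, 100.0)
print(logits_per_image.shape)  # torch.Size([1, 2])
probs = logits_per_image.softmax(dim=1)  # label probabilities, as in the benchmark
```

With the real model, the scale would be model.logit_scale.exp() instead of the placeholder 100.0, and wrapping the calls in torch.no_grad() avoids tracking gradients.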
In recent years, the password-cracking speed of graphics processing units (GPUs) has driven the adoption of deliberately slow methods like PBKDF2 (Password-Based Key Derivation Function 2) for secure password storage. PBKDF2 is a key derivation function that is designed to be computationally expensive in order to slow down dictionary attacks and other brute force attacks on passwords. Because every guess costs an attacker many hash computations, PBKDF2 has become a popular choice for password storage.
As the development of processing power continues to advance, it has become necessary to increase the number of iterations used in PBKDF2 in order to maintain a high level of security. With more iterations, it becomes even more difficult for an attacker to crack a password using brute force methods.
Recently, I had an idea. What if it were possible to run PBKDF2 arbitrarily long and print out points that match certain criteria? This could potentially provide an even higher level of security for password storage, as the number of iterations could be increased to levels that would make brute force attacks infeasible. It's an idea worth exploring and I'm excited to see what the future holds for PBKDF2 and other password security measures.
Bitcoin difficulty
One of the key features of the Bitcoin network is its use of difficulty to scale the hardness of block signing based on the number of computers that are currently mining. In other words, as more computers join the network and begin trying to solve the cryptographic puzzles required to add new blocks to the blockchain, the difficulty of these puzzles increases in order to maintain a consistent rate of block creation. This ensures that the network remains secure and resistant to attacks, even as the number of miners grows over time.
The basic idea behind this technique is fairly simple: by requiring that the block hash contain a certain number of zero bits, the complexity of the puzzle increases in powers of two. Every hash is essentially random, and modifying the hashed data by the tiniest bit results in a completely new hash. Half of all hashes end in a zero bit and half in a one. Requiring two zero bits leaves every 4th hash. To zero a full byte (8 bits) you need 256 (2^8) tries on average. With three bytes, it's close to 17 million (2^24).
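To make the arithmetic concrete, here is a small sketch that searches for a SHA256 hash ending in a given number of zero bits. It is a toy illustration of the principle, not Bitcoin's actual block format or target check; the function names and the 8-byte nonce encoding are assumptions made for the example.

```python
import hashlib

def trailing_zero_bits(digest):
    """Count how many zero bits the digest ends with."""
    n = int.from_bytes(digest, 'big')
    if n == 0:
        return len(digest) * 8
    return (n & -n).bit_length() - 1  # index of the lowest set bit

def find_nonce(data, bits):
    """Try nonces 0, 1, 2, ... until the hash ends in at least `bits` zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(data + nonce.to_bytes(8, 'big')).digest()
        if trailing_zero_bits(digest) >= bits:
            return nonce, digest
        nonce += 1

nonce, digest = find_nonce(b'block data', 8)
print(nonce, digest.hex())  # the hex digest ends in "00"
print(2 ** 24)              # 16777216 expected tries for three zero bytes
```

Each extra required bit doubles the expected number of tries, which is why difficulty can be tuned in fine steps.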
Printing out PBKDF2 steps at deterministic points
Combining the two ideas is one way to deterministically create encryption keys of increasing difficulty:
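One way to sketch the combination, under my own assumptions: run the PBKDF2 chain and report an iteration as a checkpoint whenever the intermediate HMAC output U ends in a given number of zero bits. The function name pbkdf2_checkpoints, the choice of testing U rather than the accumulated key, and the max_iter cutoff are all illustrative.

```python
import hashlib
import hmac

def pbkdf2_checkpoints(pwd, salt, bits, max_iter):
    """Yield (iteration, key) whenever the low `bits` bits of U are all zero."""
    h = hmac.new(pwd, digestmod=hashlib.sha256)
    m = h.copy()
    m.update(salt + b'\x00\x00\x00\x01')  # first block: Salt + INT_32_BE(1)
    U = m.digest()
    T = U
    mask = (1 << bits) - 1
    for i in range(1, max_iter + 1):
        if i > 1:
            m = h.copy()
            m.update(U)  # PRF(Password, previous U)
            U = m.digest()
            T = bytes(a ^ b for a, b in zip(U, T))
        if int.from_bytes(U, 'big') & mask == 0:
            yield i, T  # T equals the PBKDF2 output for i iterations

for i, key in pbkdf2_checkpoints(b'password', b'salt', 8, 5000):
    print(i, key.hex())
```

Because the checkpoints depend only on the password and salt, the same inputs always give the same sequence, and picking a later checkpoint means a proportionally longer derivation.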
Continuing the awesome and not so unique stream of ideas on what to do with
ChatGPT, here's a slightly modified take on my previous post on self-running ChatGPT generated Python code.
This time, let's do a shell script that takes a description of what you want as a shell command, and returns just that command. Here's how it will work:
$ shai find latest 3 files
46 tokens:
ls -lt | head -n 3
$ ls -lt | head -n 3
total 1233
-rwxrwxrwx 1 root root 5505 huhti 4 2023 python-chatgpt-ai.md
-rwxrwxrwx 1 root root 10416 maalis 26 2023 golang-sveltekit-chatgpt.md
As seen, this time I'm not running the command automatically, but just returning the command. This is a bit safer, as you can inspect the command before running it. The script is quite simple:
#!/usr/bin/env python
import sys, os
from openai import OpenAI
from configparser import ConfigParser
# Read the config file openai.ini from same directory as this script
path = os.path.dirname(os.path.realpath(__file__))
config = ConfigParser()
config.read(path + '/openai.ini')
client = OpenAI(api_key=config['openai']['api_key'])
prompt = ' '.join(sys.argv[1:])
role = ('You are Ubuntu linux shell helper. Given a question, '
'answer with just a shell command, nothing else.')
chat_completion = client.chat.completions.create(
messages=[ { "role": "system", "content": role },
{ "role": "user", "content": prompt } ],
model = config['shai']['model']
)
print(chat_completion.usage.total_tokens, 'tokens:')
print(chat_completion.choices[0].message.content)
I decided GPT 3.5 Turbo is a good model for this, as it should be able to handle shell commands quite well. You also need to have an openai.ini file in the same directory as the script, with the following content:
[openai]
api_key = sk-xxx
[shai]
model = gpt-3.5-turbo
To use it, just install the OpenAI Python package with pip install openai, and then you can use the script like this:
$ chmod +x shai
$ ./shai find latest 3 files
And putting the script in your path, you can use it like any other shell
command. Enjoy!