Code and Life

Programming, electronics and other cool tech stuff

Supported by

Supported by Picotech

Configuring CryptoJS to Use SHA256 HMAC with PBKDF2 for Added Security

The security of sensitive information is of utmost importance today. One way to enhance the security of stored passwords is by using PBKDF2 with SHA256 HMAC, a cryptographic algorithm that adds an extra layer of protection if the password hashes are compromised. I covered recently how to calculate PBKDF2 yourself with Python, but today needed to do the same with Javascript.

Having just tried out CryptoJS to do some AES decryption, I thought I'll try the library's PBKDF2() function as well.

CryptoJS documentation on PBKDF2 is as scarce as everything on this library, and trying out the 256 bit key example with Node.js gives the following output:

$ node
Welcome to Node.js v19.6.0.
Type ".help" for more information.
> const crypto = require('crypto-js')
undefined
> const key = crypto.PBKDF2("password", "salt", {keySize: 256/32, iterations: 4096})
undefined
> crypto.enc.Hex.stringify(key)
'4b007901b765489abead49d926f721d065a429c12e463f6c4cd79401085b03db'

Now let's recall what Python gave here:

$ python
Python 3.10.7 (tags/v3.10.7:6cc6b13, Sep  5 2022, 14:08:36) [MSC v.1933 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 4096).hex()
'c5e478d59288c841aa530db6845c4c8d962893a001ce4e11a4963873aa98134a'
>>>

Uh-oh, they don't match! Now looking at pbkdf2.js in the CryptoJS Github source, we can see that the algorithm defaults to SHA-1:

 cfg: Base.extend({
            keySize: 128/32,
            hasher: SHA1,
            iterations: 1
        }),

Seeing how keySize and iterations are overridden, we only need to locate the SHA256 in proper namespace to guide the implementation to use SHA256 HMAC instead:

$ node
Welcome to Node.js v19.6.0.
Type ".help" for more information.
> const crypto = require('crypto-js')
undefined
> const key = crypto.PBKDF2("password", "salt", {keySize: 256/32,
...     iterations: 4096, hasher: crypto.algo.SHA256})
undefined
> crypto.enc.Hex.stringify(key)
'c5e478d59288c841aa530db6845c4c8d962893a001ce4e11a4963873aa98134a'

Awesome! Now we are ready to rock'n'roll with a proper hash function.

Read post

Recreating Chris Veness' AES256 CTR decryption with CryptoJS for fun and profit

A quick one tonight. Having just spent enjoyable hacking time to reverse engineer Chris Veness' AES 256 encryption library enough to be able to decrypt some old data I had using CryptoJS, I thought I will share it with the world to enjoy.

Now Chris' library is nice and simple, you can just encrypt stuff with AES 256 counter mode with a single line of code:

> AESEncryptCtr('Well this is fun!', 'password', 256)
'SdzeY4GBgYHDEWay4JdHr/CnwwnAoBfjQA=='

Now AES 256 is a super standard cipher, so it should be pretty easy to decrypt that with another libray, right?

WRONG!

Wrong, standard AES crypto is not always easy to decrypt

Turns out that Chris' library does not use the 'password' in a way most other libraries use it, but instead chooses to:

  1. Decode the string into UTF8
  2. Create a 16 byte array and put the decoded string into beginning of the array
  3. Initialize AES encryption ("key expansion") using this array
  4. Encrypt a copy of the array as a single AES block using the key expansion (that the library has internally just been initialized with)
  5. Expand the AES encrypted 16 bytes by doubling the array into 32 byte one
  6. Use the 32 byte array as the AES key

Now if you think that any other library would use the same method, you would be dead wrong. Also, most libraries do not expose the same set of functions to replicate this process in any simple manner.

To add insult to injury, JavaScript has so poor support for byte data that it seems each crypto library uses its own internal representation of binary data. For example, CryptoJS likes to make a uint32 array, so [1, 2, 3, 4, 5, 6, 7, 8] is represented as {words: [0x01020304, 0x05060708], sigBytes: 8}!

Thankfully, yours truly is a true master and after just 2 hours of trial and error, I managed to produce this golden nugget:

const crypto = require('crypto-js');

function VanessKey(password) {
  const iv = { words: [0,0,0,0], sigBytes: 16};
  const encrypted = crypto.AES.encrypt(
    crypto.enc.Utf8.parse(password.padEnd(16, '\0')),
    crypto.enc.Utf8.parse(password.padEnd(32, '\0')), { iv }
  ).ciphertext;
  const ew = encrypted.words.slice(0, 4);
  // double the first 4 words of the encrypted
  encrypted.words = ew.concat(ew);
  return encrypted;
}

const key = VanessKey('password');
console.log(key);
console.log(crypto.enc.Hex.stringify(key));

Saving it as decrypt.cjs and running node decrypt.cjs should produce the "Vaness key" for 'password':

$ node poista.cjs
{
  words: [
    233507778, -213013704,
     65782802, -856145110,
    233507778, -213013704,
     65782802, -856145110
  ],
  sigBytes: 32
}
0deb0bc2f34dab3803ebc412ccf8432a0deb0bc2f34dab3803ebc412ccf8432a

Using the key to decrypt the encrypted data

Now we only need the magic incantation to decrypt the Base64 encoded data SdzeY4GBgYHDEWay4JdHr/CnwwnAoBfjQA== produced in the intro! This is also pretty non-straightforward:

Read post

Decoding 433 MHz Radio Signals with Raspberry Pi 4 and Picoscope Digital Oscilloscope

PicoScope 2208B MSO and Raspberry Pi 4 Finale time! After analyzing the Nexa 433 MHz smart power plug remote control signal with Arduino Uno and regular USB soundcard, it is time to try some heavier guns: Raspberry Pi 4 and PicoScope 2208B MSO.

I was initially sceptical on using a Raspberry Pi for analyzing signals, due to several reasons:

  1. Most GPIO projects on RaspPi seem to use Python, which is definitely not a low-latency solution, especially compared to raw C.
  2. Having done a raw C GPIO benchmark on RaspPi in the past, the libraries were indeed quite... low level.
  3. I had serious doubts that a multitasking operating system like Linux running on the side of time-critical signal measurements might impact performance (it is not a RTOS after all).

However, there are projects like rpi-rf that seem to work, so I dug in and found out some promising aspects:

  • There is a interrupt driven GPIO edge detection capability in the RaspPi.GPIO library that should trigger immediately on level changes.
  • Python has a sub-microsecond precision time.perf_counter_ns() that is suitable for recording the time of the interrupt

Python code to record GPIO signals in Raspberry Pi

While rpi-rf did not work properly with my Nexa remote, taking hints from the implementation allowed me to write pretty concise Python script to capture GPIO signals:

from RPi import GPIO
import time, argparse

parser = argparse.ArgumentParser(description='Analyze RF signal for Nexa codes')
parser.add_argument('-g', dest='gpio', type=int, default=27, help='GPIO pin (Default: 27)')
parser.add_argument('-s', dest='secs', type=int, default=3, help='Seconds to record (Default: 3)')
parser.add_argument('--raw', dest='raw', action='store_true', default=False, help='Output raw samples')
args = parser.parse_args()

times = []

GPIO.setmode(GPIO.BCM)
GPIO.setup(args.gpio, GPIO.IN)
GPIO.add_event_detect(args.gpio, GPIO.BOTH,
        callback=lambda ch: times.append(time.perf_counter_ns()//1000))

time.sleep(args.secs)
GPIO.remove_event_detect(args.gpio)
GPIO.cleanup()

# Calculate difference between consecutive times
diff = [b-a for a,b in zip(times, times[1:])]

if args.raw: # Print a raw dump
    for d in diff: print(d, end='\n' if d>5e3 else ' ')

The code basically parses command line arguments (defaulting to GPIO pin 27 for input) and sets up GPIO.add_event_detect() to record (microsecond) timings of the edge changes on the pin. In the end, differences between consecutive times will be calculated to yield edge lengths instead of timings.

Raspberry Pi is nice also due to the fact that it has 3.3V voltage available on GPIO. Wiring up the 433 MHz receiver was a pretty elegant matter (again, there is a straight jumper connection between GND and last pin of the receiver):

Raspberry Pi wired to 433 MHz receiver

  1. Wire the receiver in
  2. Start the script with python scan.py --raw (use -h option instead for help on command)
  3. Press the Nexa remote button 1 during the 3 second recording interval

Here's how the output looks like (newline is added after delays longer than 5 ms):

Read post

Using a Soundcard as an Oscilloscope to Decode 433 MHz Smart Power Plug Remote Control Signals

I previously covered how to decode Nexa smart power plug remote control signals using Arduino Uno. The drawback of Arduino was the limited capture time and somewhat inaccurate timing information (when using a delay function instead of an actual hardware interrupt timer).

Another way to look at signals is an oscilloscope, and a soundcard can work as a "poor man's oscilloscope" in a pinch. You wire the signal to left/right channel (you could even wire two signals at the same time) and "measure" by recording the audio. It helps if ground level is shared between the soundcard and the signal source. In this case I achieved it pretty well by plugging the Arduino that still provides 3.3V voltage for the 433 MHz receiver chip to the same USB hub as the soundcard.

Another thing to consider is the signal level. Soundcard expects the voltage to be around 0.5V, whereas our receiver is happily pushing full 3.3V of VCC when the signal is high. This can be fixed with a voltage divider -- I wired the signal via a 4.7 kOhm resistor to the audio plug tip, and a 1 kOhm resistor continues to GND -- effectively dropping the voltage to less than a fifth of original.

The audio plug "sleeve" should be connected to GND as L/R channel voltages are relative to that. There was a 0.1V difference in voltage between soundcard sleeve and Arduino GND so I decided to put a 1 kOhm resistor also here to avoid too much current flowing through a "ground loop".

Note that I'm using a SparkFun breakout for the plug that makes connecting the plug to a breadboard quite easy. If you need to wire it directly to the connector (signal to "tip" and ground to "sleeve"), refer to diagram here: https://en.wikipedia.org/wiki/Phone_connector_(audio). A word of caution: You can break stuff if you wire it incorrectly or provide too high voltage to some parts, so be careful and proceed at your own risk!

Soundcard as an Oscilloscope

You can see the circuit in the picture above. Note that due to receiver needing GND in two places, I've been able to connect the sleeve to another GND point (rightmost resistor) than the other 1 kOhm resistor that is part of the voltage divider (leftmost resistor)

Trying it out with Audacity

Once you have power to the receiver, and 1 kOhm resistors to sleeve and tip and signal through a 4.7 kOhm (anything between that and 22 kOhm works pretty well with 16 bits for our ON/OFF purposes), you can start a recording software like Audacity to capture some signals. Here is a sample recording where I've pressed the remote a couple of times:

Audacity example

As you can see, our signal is one-sided (audio signals should be oscillating around GND, where as our is between GND and +0.5V) which causes the signal to waver around a bit. If you want a nicer signal you can build a somewhat more elaborate circuit but this should do for our purposes.

By the way, you can use ctrl-A to select all of the recorded audio and use the "Amplify" effect (with default settings) maximize signal to available audio headroom. Makes looking at the signals a little easier. I've done that before the captures below. Another issue becomes apparent when we zoom in a bit closer:

Audacity example two: zoomed in

Due to automatic gain in the receiver chip (or maybe some other anomaly), the receiver consistently captures some "garbage signal" some 200 ms after the "official" Nexa signal we got a glimpse of in the [previous part of this series]. It is not pure noise but seems to contain some other signal, probably coming somewhere else in the building but at a much lower power level -- the receiver records this as well, but blocks it for a while once it gets the more powerful Nexa signal.

To analyze the Nexa signal, I suggest deleting other parts of audio capture than the actual signals. I actually tore open the Nexa remote and measured the signal straight from the antenna to make sure this recurring "tail" is not coming from the remote. :D

Below you can see a zoom-in of the signal at various levels of magnification. Note that I'm not zooming in on the first signal that has the "spike" before the actual signal starts.

Nexa remote signal zoomed in Nexa remote signal zoomed in more Nexa remote signal zoomed in even more

As you can see, the 44.1 kHz signal has enough resolution to deduce the signal protocol just with Audacity. But countint the individual samples from HIGH and LOW segments to get an average length is pretty tedious. If only we could do it programmatically...

Using Python to analyze the recorded signal

I made a handy Python script to analyze the recorded signal. Basic procedure is:

Read post

Mastering the Huggingface CLIP Model: How to Extract Embeddings and Calculate Similarity for Text and Images

Article image, neural network transforming images and text into vector data

Huggingface's transformers library is a great resource for natural language processing tasks, and it includes an implementation of OpenAI's CLIP model including a pretrained model clip-vit-large-patch14. The CLIP model is a powerful image and text embedding model that can be used for a wide range of tasks, such as image captioning and similarity search.

The CLIPModel documentation provides examples of how to use the model to calculate the similarity of images and captions, but it is less clear on how to obtain the raw embeddings of the input data. While the documentation provides some guidance on how to use the model's embedding layer, it is not always clear how to extract the embeddings for further analysis or use in other tasks.

Furthermore, the documentation does not cover how to calculate similarity between text and image embeddings yourself. This can be useful for tasks such as image-text matching or precalculating image embeddings for later (or repeated) use.

In this post, we will show how to obtain the raw embeddings from the CLIPModel and how to calculate similarity between them using PyTorch. With this information, you will be able to use the CLIPModel in a more flexible way and adapt it to your specific needs.

Benchmark example: Logit similarity score between text and image embeddings

Here's the example from CLIPModel documentation we'd ideally like to split into text and image embeddings and then calculate the similarity score between them ourselves:

from PIL import Image
import requests
from transformers import AutoProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True
)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1)  # we can take the softmax to get the label probabilities

If you run the code and print(logits_per_image) you should get:

tensor([[18.9041, 11.7159]], grad_fn=<PermuteBackward0>)

The code calculating the logits is found in forward() function source

Acquiring image and text features separately

There are pretty promising looking examples in get_text_features() and get_image_features() that we can use to get CLIP features for either in tensor form:

from PIL import Image
import requests
from transformers import AutoProcessor, AutoTokenizer, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")

# Get the text features
tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-large-patch14")

inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
text_features = model.get_text_features(**inputs)

print(text_features.shape) # output shape of text features

# Get the image features
processor = AutoProcessor.from_pretrained("openai/clip-vit-large-patch14")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")

image_features = model.get_image_features(**inputs)

print(image_features.shape) # output shape of image features

Running this should yield the following output:

$ python cliptest.py
torch.Size([2, 768])
torch.Size([1, 768])

Looks pretty good! Two 768 item tensors for the two labels, and one similarly sized for the image! Now let's see if we can calculate the similarity between the two...

Read post

Analyzing 433 MHz Nexa Smart Power Plug Remote Control Signal with Arduino Uno

A friend recently started a project to remotely boot his router (which tends to hang randomly) with Raspberry Pi. Unfortunately, the rpi-rf tool was not quite recognizing the signals. I pitched in to help, and as he did not have access to an oscilloscope, but had an Arduino Uno, I thought maybe I could figure it out with that.

Fast forward a few weeks later, I have been experimenting with four methods analyzing my own Nexa 433 MHz remote controller:

  1. Arduino Uno
  2. Soundcard a.k.a. "poor man's oscilloscope"
  3. Raspberry Pi
  4. An actual oscilloscope, namely my Picoscope 2208B

Having learned a lot, I thought to document the process for others to learn from, or maybe even hijack to analyze their smart remotes. In this first part, I will cover the process with Arduino Uno, and the following posts will go through the other three methods.

Starting Simple: Arduino and 433 MHz receiver

Having purchased a rather basic Hope Microelectronics (RFM210LCF-433D) 3.3V receiver for the 433 MHz spectrum signals, it was easy to wire to Arduino:

  1. Connect GND and 3.3V outputs from Arduino to GND and VCC
  2. Connect Arduino PIN 8 to DATA on the receiver
  3. Connect a fourth "enable" pin to GND as well to turn the receiver on

You can see the setup here (larger version):

Arduino wired to Hope 433 MHz rf receiver

I wrote a simple Arduino script that measures the PIN 8 voltage every 50 microseconds (20 kHz), recording the length of HIGH/LOW pulses in a unsigned short array. Due to memory limitation of 2 kB, there is only space for about 850 edges, and the maximum length of a single edge is about 65 000 samples, i.e. bit more than three seconds.

Once the buffer is filled with edge data or maximum "silence" is reached, the code prints out the data over serial, resets the buffer and starts again, blinking a LED for 5 seconds so you know when you should start pressing those remote control buttons. Or perhaps "press a button", as at least my Nexa pretty much fills the buffer with a single key press, as it sends the same data of about 130 edges a minimum of 5 times, taking almost 700 edges!

It also turned out that the "silence" limit is rarely reached, as the Hope receiver is pretty good at catching stray signals from other places when there is nothing transmitting nearby (it likely has automatic sensitivity to "turn up the volume" if it doesn't hear anything).

#define inputPin 8 // Which pin to monitor

void setup() {
  pinMode(LED_BUILTIN, OUTPUT);
  Serial.begin(38400);
}

#define BUFSIZE 850
#define WAIT_US 50 // 20 kHz minus processing overhead

unsigned short buf[BUFSIZE];
short pos = 0;
bool measure = false;

void loop() {
  if(measure) {
    unsigned char data = digitalRead(inputPin) == HIGH ? 1 : 0;
    if(data == (pos&1))
      pos++; // data HIGH and odd pos or LOW and even -> advance
    
    if(buf[pos] > 65000 || pos >= BUFSIZE) {
      measure = false;
      return; // avoid overflow
    }
    
    buf[pos]++;
    delayMicroseconds(WAIT_US); // Wait until next signal
  } else { // end measure
    digitalWrite(LED_BUILTIN, LOW); // LED OFF = dumping
    Serial.println();
    Serial.print("N =");
    Serial.println(pos, DEC);
    for(int i=0; i<pos && i<BUFSIZE; i++) {
      Serial.print(buf[i], DEC);
      if(buf[i] > 100000/WAIT_US)
        Serial.println(); // 100 ms delays get a newline
      else
        Serial.print(' ');
      buf[i] = 0; // reset
    }
    pos = 0; // restart
    measure = true;
    for(int i=0; i<10; i++) { // 5s blink before next measurement 
      digitalWrite(LED_BUILTIN, (i&1) ? LOW : HIGH); 
      delay(500); 
    }
    Serial.println("\nGO");
    digitalWrite(LED_BUILTIN, HIGH); // LED ON = measuring
  }
}

Analyzing the data

Here is some sample data from the serial monitor:

Read post