This project intends to provide a complete description and re-implementation of the WhatsApp Web API, which will eventually lead to a custom client. WhatsApp Web internally works using WebSockets; this project does as well.
There's no need to install or manage Python and Node versions: the file
shell.nix defines an environment that includes all the dependencies needed
to run this project.
There's an .envrc file in the root folder that is loaded automatically when
you cd (change directory) into the project. If direnv is installed along
with nix, you should get output like this:
>cd ~/dev/whatsapp
Installing node modules
npm WARN prepare removing existing node_modules/ before installation
> [email protected] install /home/rainy/dev/whatsapp/node_modules/fsevents
> node-gyp rebuild
make: Entering directory '/home/rainy/dev/whatsapp/node_modules/fsevents/build'
SOLINK_MODULE(target) Release/obj.target/.node
COPY Release/.node
make: Leaving directory '/home/rainy/dev/whatsapp/node_modules/fsevents/build'
> [email protected] postinstall /home/rainy/dev/whatsapp/node_modules/nodemon
> node bin/postinstall || exit 0
added 310 packages in 3.763s
Done.
$$ $$ $$ $$
$$ | $ $$ |$$ | $$ |
$$ |$$$ $$ |$$$$$$$ $$$$$$ $$$$$$ $$$$$$$ $$$$$$ $$$$$$ $$$$$$
$$ $$ $$$$ |$$ __$$ ____$$\_$$ _| $$ _____| ____$$ $$ __$$ $$ __$$
$$$$ _$$$$ |$$ | $$ | $$$$$$$ | $$ | $$$$$$ $$$$$$$ |$$ / $$ |$$ / $$ |
$$$ / $$$ |$$ | $$ |$$ __$$ | $$ |$$ ____$$ $$ __$$ |$$ | $$ |$$ | $$ |
$$ / $$ |$$ | $$ |$$$$$$$ | $$$$ |$$$$$$$ |$$$$$$$ |$$$$$$$ |$$$$$$$ |
__/ __|__| __| _______| ____/ _______/ _______|$$ ____/ $$ ____/
$$ | $$ |
$$ | $$ |
__| __|
Node v13.13.0
Python 2.7.17
Try running server with: npm start
[nix-shell:~/dev/whatsapp]$
If you don't use direnv, or just want to enter the build environment
manually, run nix-shell in the project root.
Before you can run the application, make sure that you have the following software installed:
- Node.js (a version that supports the async await syntax, which is used here)
- Python with the following pip packages installed:
  - websocket-client and git+https://github.com/dpallot/simple-websocket-server.git for acting as WebSocket server and client.
  - curve25519-donna and pycrypto for the encryption stuff.
  - pyqrcode for QR code generation.
  - protobuf for reading and writing the binary conversation format.

On Windows, curve25519-donna requires Microsoft Visual C++ 9.0 and you need to copy stdint.h into C:\Users\YOUR USERNAME\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\include.

Before starting the application for the first time, run npm install -f to install all Node dependencies and pip install -r requirements.txt for all Python dependencies.
To launch the application, run npm start on Linux-based systems and npm run win on Windows. Using concurrently and nodemon, all three local components are started one after another, and when you edit a file, the changed module restarts automatically to apply the changes.
A recent addition is a version of the decryption routine translated to in-browser JavaScript. Run node index_jsdemo.js (just needed because browsers don't allow changing HTTP headers for WebSockets), then open client/login-via-js-demo.html as a normal file in any browser. The console output should show decrypted binary messages after scanning the QR code.
adiwajshing created Baileys, a Node library that implements the WhatsApp Web API.
ndunks made a TypeScript reimplementation at WaJs.
p4kl0nc4t created kyros, a Python package that implements the WhatsApp Web API.
With whatsappweb-rs, wiomoc created a WhatsApp Web client in Rust.
Rhymen created go-whatsapp, a Go package that implements the WhatsApp Web API.
vzaramel created whatsappweb-clj, a Clojure library that implements the WhatsApp Web API.
The project is organized in the following way. Note the used ports and make sure that they are not in use elsewhere before starting the application.
WhatsApp Web encrypts the data using several different algorithms. These include AES 256 CBC, Curve25519 as Diffie-Hellman key agreement scheme, HKDF for generating the extended shared secret and HMAC with SHA256.
Starting the WhatsApp Web session happens by just connecting to one of its websocket servers at wss://w[1-8].web.whatsapp.com/ws (wss:// means that the websocket connection is secure; w[1-8] means that any number between 1 and 8 can follow the w). Also make sure that, when establishing the connection, the HTTP header Origin: https://web.whatsapp.com is set, otherwise the connection will be rejected.
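As a minimal sketch of the connection step, the snippet below builds one of the eight endpoint URLs and opens the socket with the required Origin header. The helper names endpoint and connect are illustrative and not part of this project; connect assumes the websocket-client package is installed and that network access is available.

```python
import random

ORIGIN = "https://web.whatsapp.com"


def endpoint(n=None):
    """Return the URL of one of the eight WhatsApp Web websocket servers."""
    if n is None:
        n = random.randint(1, 8)  # any server between w1 and w8 may be used
    if not 1 <= n <= 8:
        raise ValueError("server number must be between 1 and 8")
    return "wss://w%d.web.whatsapp.com/ws" % n


def connect(n=None):
    """Open the websocket with the Origin header set (needs network access)."""
    # Imported lazily so that endpoint() works without the package installed.
    import websocket  # provided by the websocket-client pip package
    return websocket.create_connection(endpoint(n), origin=ORIGIN)
```

Calling connect() without arguments picks a server at random; without the Origin header the server is expected to reject the connection, as described above.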
When you send messages to a WhatsApp Web websocket, they need to be in a specific format. It is quite simple and looks like messageTag,JSON, e.g. 1515590796,["data",123]. The message tag can apparently be anything, as long as it does not contain a comma. This application mostly uses the current timestamp as tag, just to be a bit unique. WhatsApp itself often uses message tags like s1 or 1234.--0. The payload may also be a JSON object rather than an array.
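The framing described above can be sketched with the standard library alone. The function names encode_message, decode_message and timestamp_tag are hypothetical helpers for illustration, not functions of this project.

```python
import json
import time


def timestamp_tag():
    """Return the current timestamp in seconds as a message tag."""
    return str(int(time.time()))


def encode_message(tag, payload):
    """Build a 'messageTag,JSON' string from a tag and a JSON-serializable payload."""
    if "," in tag:
        raise ValueError("the message tag may not contain a comma")
    # Compact separators reproduce the form shown in the text, e.g. ["data",123].
    return tag + "," + json.dumps(payload, separators=(",", ":"))


def decode_message(raw):
    """Split a JSON message at the first comma into (tag, payload)."""
    tag, _, body = raw.partition(",")
    return tag, json.loads(body)
```

For example, encode_message("1515590796", ["data", 123]) yields the string 1515590796,["data",123] used as the example above.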
To log in at an open websocket, follow these steps:
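1. Generate your own clientId, which needs to be 16 base64-encoded bytes (i.e. 24 characters). This application just uses 16 random bytes, i.e. base64.b64encode(os.urandom(16)) in Python.
2. Choose a tag for your message (see above) and remember it for later.
3. Send the following message to the websocket: messageTag,["admin","init",[0,3,2390],["Long browser description","ShortBrowserDesc"],"clientId",true].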
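   - Replace messageTag and clientId by the values you chose before.
   - The [0,3,2390] part specifies the current WhatsApp Web version. The last value changes frequently. It should be quite backwards-compatible though.
   - "Long browser description" is an arbitrary string that will be shown in the WhatsApp app in the list of registered WhatsApp Web clients after you scan the QR code.
   - "ShortBrowserDesc" has not been observed anywhere yet but is arbitrary as well.
4. The server answers with a message carrying the tag you chose. Its JSON object has the following attributes:
   - status: should be 200
   - ref: in the application, this is treated as the server ID; important for the QR generation, see below
   - ttl: is 20000, maybe the time after which the QR code becomes invalid
   - update: a boolean flag
   - curr: the current WhatsApp Web version, e.g. 0.2.7314
   - time: the timestamp the server responded at, as floating-point milliseconds, e.g. 1515592039037.0
5. Generate your own private key with Curve25519, e.g. curve25519.Private().
6. Get the public key from your private key, e.g. privateKey.get_public().
7. Build the string that the QR code will encode by joining the following values with commas:
   - the value of the ref attribute from step 4
   - the base64-encoded version of your public key, i.e. base64.b64encode(publicKey.serialize())
   - your clientId
8. Turn this string into an image (e.g. using pyqrcode) and scan it using the WhatsApp app.

To request a new ref for QR code generation:

1. A new server ref can be requested when the previous one expires (see ttl).
2. Do so by sending messageTag,["admin","Conn","reref"].
3. The server responds with JSON containing the following attributes:
   - status: should be 200 (other ones: 304 - reuse previous ref, 429 - new ref denied)
   - ref: new ref
   - ttl: expiration time

After you scan the QR code, the websocket receives several JSON messages whose payload is an array. The first entry of the array has one of the following values:

- Conn: array contains JSON object as second element with connection information containing the following attributes and many more: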
  - battery: the current battery percentage of your phone
  - browserToken: used to log out without an active WebSocket connection (not implemented yet)
  - clientToken: used for resuming closed sessions, aka "Remember me" (not implemented yet)
  - phone: an object with detailed information about your phone, e.g. device_manufacturer, device_model, os_build_number, os_version
  - platform: your phone OS, e.g. android
  - pushname: the name you provided to WhatsApp
  - secret (remember this!)
  - serverToken: used for resuming closed sessions, aka "Remember me" (not implemented yet)
  - wid: your phone number in the chat identification format (see below)
- Stream: array has four elements in total, so the entire payload is like ["Stream","update",false,"0.2.7314"]
- Props: array contains JSON object as second element with several properties like imageMaxKBytes (1024), maxParticipants (257), videoMaxEdge (960) and others

You are now ready to generate the final encryption keys:

1. Start by decoding the secret from Conn as base64 and storing it as secret. This decoded secret will be 144 bytes long.
2. Take the first 32 bytes of the decoded secret and use them as a public key. Together with your private key, generate a shared key out of it and call it sharedSecret. The application does it using privateKey.get_shared_key(curve25519.Public(secret[:32]), lambda a:a).
3. Extend sharedSecret to 80 bytes using HKDF. Call this value sharedSecretExpanded.
4. Validate the data provided by the server by calculating HmacSha256(sharedSecretExpanded[32:64], secret[:32] + secret[64:]). Compare this value to secret[32:64]. If they are not equal, abort the login.
5. You now have the encrypted keys: store sharedSecretExpanded[64:] + secret[64:] as keysEncrypted.
6. Decrypt the encrypted keys using AES with sharedSecretExpanded[:32] as key, i.e. store AESDecrypt(sharedSecretExpanded[:32], keysEncrypted) as keysDecrypted.
7. The keysDecrypted variable is 64 bytes long and contains two keys, each 32 bytes long. The encKey is used for decrypting binary messages sent to you by the WhatsApp Web server or encrypting binary messages you send to the server. The macKey is needed to validate the messages sent to you:
   - encKey: keysDecrypted[:32]
   - macKey: keysDecrypted[32:64]

To restore a closed session:

1. After sending the init command, check whether you have serverToken and clientToken.
2. If so, send messageTag,["admin","login","clientToken","serverToken","clientId","takeover"].
3. The server should respond with {"status": 200}. Other statuses:
   - Check the tos field in the JSON: if it is equal to or greater than 2, you have violated the TOS.

To resolve a challenge:

1. After sending a valid serverToken and clientToken, you will be challenged to confirm that you still have valid encryption keys.
2. The challenge looks like this: messageTag,["Cmd",{"type":"challenge","challenge":"BASE_64_ENCODED_STRING=="}]
3. Decode the challenge string from Base64, sign it with your macKey, encode it back with Base64 and send messageTag,["admin","challenge","BASE_64_ENCODED_STRING==","serverToken","clientId"]
4. The server should respond with {"status": 200}, but it means nothing.

To log out:

1. When you have an active WebSocket connection, just send goodbye,,["admin","Conn","disconnect"].
2. When you don't have one, sign your encKey with your macKey and encode it with Base64. Let's say it is your logoutToken.
3. Send a request to https://dyn.web.whatsapp.com/logout?t=browserToken&m=logoutToken

Now that you have the two keys, validating and decrypting messages the server sent to you is quite easy. Note that this is only needed for binary messages; all JSON you receive stays plain. The binary messages always have 32 bytes at the beginning that specify the HMAC checksum. Both JSON and binary messages have a message tag at their very start that can be discarded, i.e. only the portion after the first comma character is significant.
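How the material for those two keys is prepared during login (HKDF expansion, HMAC check, assembling keysEncrypted) can be sketched with the standard library. This is a sketch under assumptions, not the project's code: the function names are hypothetical, the HKDF salt of 32 zero bytes and the empty info string are assumptions the text does not state, and the Curve25519 key agreement and the final AES decryption are left out because they need third-party packages.

```python
import hashlib
import hmac


def hkdf_sha256(key, length, salt=b"\x00" * 32, info=b""):
    """HKDF (RFC 5869) with SHA-256; the salt and info defaults are assumptions."""
    prk = hmac.new(salt, key, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step, one 32-byte block per round
        msg = block + info + bytes(bytearray([counter]))
        block = hmac.new(prk, msg, hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def prepare_key_decryption(shared_secret, secret):
    """Return (aes_key, keys_encrypted) after the HMAC check, or raise ValueError."""
    if len(secret) != 144:
        raise ValueError("the decoded secret should be 144 bytes long")
    expanded = hkdf_sha256(shared_secret, 80)  # sharedSecretExpanded
    expected = hmac.new(expanded[32:64], secret[:32] + secret[64:],
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, secret[32:64]):
        raise ValueError("HMAC validation failed, aborting login")
    # AESDecrypt(aes_key, keys_encrypted) then yields encKey followed by macKey.
    return expanded[:32], expanded[64:] + secret[64:]
```

Here shared_secret stands for sharedSecret from the key agreement and secret for the base64-decoded secret from Conn.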
1. Validate the message by hashing it again with the macKey (here messageContent is the entire binary message): HmacSha256(macKey, messageContent[32:]). If this value is not equal to messageContent[:32], the message sent to you by the server is invalid and should be discarded.
2. Decrypt the message content using AES and the encKey: AESDecrypt(encKey, messageContent[32:]).

The data you get in the final step has a binary format, which is described in the following. Even though it's binary, you can still see several strings in it; the content of messages you sent is especially easy to spot.
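The validation step for incoming binary messages can be sketched as follows. The function names are hypothetical, and aes_decrypt is a placeholder callable standing in for AESDecrypt(encKey, ...), since AES is not available in the Python standard library.

```python
import hashlib
import hmac


def validate_message(mac_key, message_content):
    """Return the encrypted part of a binary message, or None if the HMAC is wrong."""
    checksum, encrypted = message_content[:32], message_content[32:]
    expected = hmac.new(mac_key, encrypted, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, checksum):
        return None  # invalid message, discard it
    return encrypted


def handle_binary_message(raw, mac_key, aes_decrypt):
    """Strip the message tag, validate the rest and pass it on for decryption."""
    # Only the portion after the first comma is significant.
    _tag, _, message_content = raw.partition(b",")
    encrypted = validate_message(mac_key, message_content)
    if encrypted is None:
        return None
    return aes_decrypt(encrypted)
```

Splitting at the first comma only matters here because the binary part may itself contain comma bytes.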
The Python script backend/decoder.py implements the MessageParser class. It is able to create a JSON structure out of binary data in which the data is still organized in a rather messy way. The section about Node Handling below will discuss how the nodes are reorganized afterwards.
MessageParser initially just needs some data and then processes it byte by byte, i.e. as a stream. It has a couple of constants and a lot of methods which all build on each other.
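To illustrate what processing the data as a stream means, here is a minimal reader that keeps a cursor into the data and builds larger reads on top of a single-byte read. It is not the project's MessageParser: the class name ByteReader and its methods are hypothetical, and the big-endian byte order in read_int is an assumption.

```python
class ByteReader(object):
    """Consume a byte string front to back, one read at a time."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.index = 0  # position of the next unread byte

    def read_byte(self):
        if self.index >= len(self.data):
            raise EOFError("end of stream reached")
        value = self.data[self.index]
        self.index += 1
        return value

    def read_int(self, n):
        """Read n bytes as one unsigned integer (big-endian is an assumption)."""
        value = 0
        for _ in range(n):
            value = (value << 8) | self.read_byte()
        return value

    def read_bytes(self, n):
        """Read n raw bytes."""
        return bytes(bytearray(self.read_byte() for _ in range(n)))
```

Each higher-level method advances the same cursor, which is what lets parsing methods build on each other.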
[None,None,None,"200","400","404","500","501","502","action","add",
 "after","archive","author","available","battery","before","body",
 "broadcast","chat","clear","code","composing","contacts","count",
 "create","debug","delete","demote","duplicate","encoding","error",
 "false","filehash","from","g.us","group","groups_v2","height","id",
 "image","in","index","invis","item","jid","kind","last","leave",
 "live","log","media","message","mimetype","missing","modify","name",
 "notification","notify","out","owner","participant","paused",
 "picture","played","presence","preview","promote","query","raw",
 "read","receipt","received","recipient","recording","relay",
 "remove","response","resume","retry","s.whatsapp.net","seconds",
 "set","size","status","subject","subscribe","t","text","to","true",
 "type","unarchive","unavailable","url","user","value","web","width",
 "mute","read_only","admin","creator","short","update","powersave",
 "checksum","epoch","block","previous","409","replaced","reason",
 "spam","modify_tag","message_info","delivery","emoji","title",
 "description","canonical-url","matched-text","star","unstar",
 "media_key","filename","identity","unread","page","page_count",
 "search","media_message","security","call_log","profile","ciphertext",
 "invite","gif","vcard","frequent","privacy","blacklist","whitelist",
 "verify","location","document","elapsed","revoke_invite","expiration",
 "unsubscribe","disable"]

- for 10, . for 11 and