Serenyx

Serenyx is a symbolic AI chat bot that doesn't use machine learning. It has a PHP API that takes care of generating responses based on the user's input. It has gone through multiple iterations, each one improving its efficiency, response time, and feature set, while removing functions that weren't used (for example, it used to be able to SSH into my phone and back up everything, but since transferring data over SSH is slower than just using a USB-C cable, I opted to remove this function from the bot). Below, you can find images that show the functions the bot is capable of, although I have removed some functions that might jeopardize the security of the bot and could possibly be exploited.

The "/clear" command empties the conversation file that's currently being used, essentially deleting the conversation history. The "/size" command can return either the size of the conversation file or the amount of free space left on the storage device the bot is currently on. "/bot" allows me to do a host of things related to the bot, such as revoke access tokens, request them, turn the bot off, or request a download link for its APK file to install on Android devices. The other two commands are self-explanatory: "/ytd" uses the YTD library to download YouTube videos, and "/yt" uses YouTube's own API to search for videos, and then uses the YTD library to provide download links for the top five most relevant results of the search.

The "/list" command can be used to create, modify, view, and delete lists. These lists can be used for anything, from reminders to shopping lists.

The next four commands can be used to generate BCrypt, SHA256, SHA512, and MD5 hashes of a given piece of text (string). There are also two functions to encode strings into Base64 and decode them from it. These functions are, once again, self-explanatory. "/shuffle" just randomly rearranges the characters of a given string.
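As a rough illustration of what the hashing, Base64, and "/shuffle" commands compute, here is a minimal Python sketch. The bot itself is written in PHP, so this is not its actual code, and the function names are hypothetical. BCrypt is left out because it needs a third-party library in Python.

```python
import base64
import hashlib
import random
from typing import Optional


def hash_text(algorithm: str, text: str) -> str:
    """Return the hex digest of `text` using "md5", "sha256", or "sha512"."""
    if algorithm not in {"md5", "sha256", "sha512"}:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


def encode_base64(text: str) -> str:
    """Encode a string into Base64."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_base64(encoded: str) -> str:
    """Decode a Base64 string back into text."""
    return base64.b64decode(encoded).decode("utf-8")


def shuffle_text(text: str, rng: Optional[random.Random] = None) -> str:
    """Return the characters of `text` in a random order."""
    chars = list(text)
    (rng or random).shuffle(chars)
    return "".join(chars)
```

The shuffled string always contains exactly the same characters as the input, only in a different order, so sorting both gives the same result.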
"/define" can be used to see the definition of a word. The bot searches through the English dictionary, and outputs the most relevant definitions. "/server" commands include restarting my Apache server, or restarting the device it's hosted on. "/website" commands let me toggle caching on my website (for CSS files and the like), toggle development mode, delete all self-hosted uploaded files etc. "/lights" allows me to control the smart bulb in my bedroom. I can change the light bulb's color, brightness, and of course turn it on or off. "/color" commands let the user convert hex to RGB, find the hex value of a color by its name (for example, you can say "/color name white" and it'd return #FFFFFF), which is extremely useful for web development. The above commands are related to cryptocurrencies. As I have a lot of wallets for different currencies, I can use the "/wallet" command along with the currency's ticker (BTC for Bitcoin, ETH for Ethereum etc.) to see what my public wallet address is for that crypto. "/mcap" can be used to view the total market capitalization of cryptos using CoinMarketCap's API. "/{Ticker} = {Amount}" lets me tell my bot how much of a certain crypto I hold (for example, I can do "/eth = 1", which would tell the bot that I hold one Ether token). "/crypto prices" outputs the current prices of the cryptocurrencies I'm interested in, again, the prices are provided by CoinMarketCap. The "/balance" command lets me check how much money (in USD) I have. It gets the amount of crypto I hold, and then for each one, it checks the price on CoinMarketCap, and multiplies my holdings by the price, to get the USD value; it then sums up the values and returns a total. In previous iterations, it used another API to get the current exchange rate between GBP and USD so it could output the total in both currencies, but I have removed that function as I only trade cryptos using USD, making the GBP value irrelevant. 
"/status" returns a string such as "up and running" or "online" to let me know the bot is functioning. If it doesn't return anything, then obviously something is wrong. "/birthday" calculates the number of days left until my birthday. "/tv" returns the number of days or months left until the next episode of the shows I'm interested in. The image above is of the login page of the bot's Android app. There is an input field for the login/access key, and a button to submit it. The app's main page is shown above. The web version looks basically identical. There is a hamburger menu icon on the top left that can be used to access the navigation drawer, and a refresh button on the top right that can be used to refetch the conversation log. As you can see from the conversation above, the bot has an understanding of context, so it has the ability to remember what it said last, and therefore can respond to questions such as "and tomorrow?" if the previous message was asking today's weather. It won't output tomorrow's weather if it is just asked "and tomorrow?", it requires context, but it will output tomorrow's weather if you ask "what's tomorrow's weather like?"; this will be explored in depth further below along with the code. The app's settings pane lets the user clear the conversation (same as running the "/clear" command), request the access keys for the bot's different platforms, toggle the app's theme, and logout from the app. When the user taps on a message, a popup window is shown displaying information about that message. The user can delete the message, or copy its content to their clipboard. The information shown includes the message's ID, the date and time at which the message was sent, and its content. The navigation drawer of the app. The user can tap on the attachment icon at the bottom left of the screen to access a popup window which lets the user upload files through the bot. Once the file has been uploaded, the bot outputs a download link to it. 
I added this feature as a way to quickly upload a file. Uploaded files can't be deleted without physical access to the server, so there is no file deletion feature; once a file has been uploaded, it'll be kept until I delete it from the server directly.

The next 24 images show the code for the bot's Android app. I coded the app in Java, and it's a native Android app.

"FCM" is a class I created to handle the notification functionality of the Serenyx app. When it receives a notification, it checks to see if the data it has to show to the user is provided as a string, or as a JSON array. There is a method called "sendNotification" that displays the notification. It starts by getting the title and the text of the notification. It then creates a new intent, builds the notification by setting its title, text, alert sound, icons, color, and vibration, and finally displays the notification using the NotificationManager.

The bot app also has an uploader, just like the web version. There is a "FileUploader" class that uses the OkHttp library to send the file to the server using POST.

The image above shows part of the code for "LoginActivity", which is the login page. Immediately at the start of the "onCreate" method, a string is defined under the variable "FCMToken", which is the unique ID that lets the device the user is on receive notifications from Firebase. Next, the app preferences are loaded to get the access key and the theme the user has chosen for the app. The "FCMToken" is then saved to the app preferences. The app then switches to the theme the user has chosen using the "toggleTheme" method, which will be explained below. The app is then given a transparent status bar, and the different components of the login page are defined.
If the aforementioned access key is found in the app preferences, then the login input field is automatically filled out with the access key, and the user is automatically logged in; this saves me from having to enter an access key every time I open the app. If the login button is clicked, then the "login" method is called with "FCMToken" as an argument, and the keyboard is then hidden.

The "toggleTheme" method accepts one parameter, which is the theme color. There are two themes for this app, "default" and "black". Because I use a phone with an OLED screen, I decided to make both a dark theme and a black theme to save battery life. If the theme is set to "default", then the background color is set to the primary color, and the input field's background resource is set to "accent_border". If the theme is black, then the background is set to the primary black color, and the input field's background is set to the lighter version of the primary black color.

The "login" method accepts "FCMToken" as a parameter, and starts by getting the value of the login input field. Volley is then used to create and send an async POST request to the bot's API with the action, login key, and Firebase token as POST parameters. Once a response is received, the app checks to see if the login was successful. If it was, then the MainActivity (the conversation screen) is launched, and the login key is passed to it. The login key is also saved in the app preferences.

The Volley POST request has three parameters: action, key, and token. "action" tells the API what action is being requested, in this case, a login. The key is the access key or password, and the token is the Firebase Cloud Messaging token that's unique to the device the app is being run on. Later on, I'll go through the bot's API and explain why this token is needed and how all this works.

The "MainActivity" code starts with an override for the "onActivityResult" method.
If the requestCode is equal to 10, then the FileUploader class is used to upload a file. Basically, the app has a file browser built into it, and file browsers utilize "onActivityResult" to determine when a user has actually chosen a file. Each result needs a request code, and I set the file browser's request code to 10. Once a file has been chosen, the path to the file is determined, and a new object from the File class is created with the path to the file. Its absolute path is then determined using the File class' "getAbsolutePath" method. This is done because the "getPath" method can actually return a relative path. The MIME type of the file is then determined, and the FileUploader class is used to create a new FileUploader instance. The FileUploader class' "uploadFile" method is then used to upload the file. The "onCreate" method of MainActivity starts by using the "toggleTheme" method to set the theme to the one saved in the app's preferences. Subsequently, a new field is created in the app preferences called "checksum" without a value. The access key that was passed to MainActivity from LoginActivity is then fetched from the MainActivity intent data. This key is then used as a parameter to call the "getConversation" method. "onCreate" is usually called when the app is launched without already being open. In this case, there is an override for the "onStart" method in order to perform some actions each time the app is opened (even with multitasking), regardless of whether or not it was closed beforehand. The "onStart" method calls the "getConversation" method, clears all notifications for the app, and sets the correct theme. The "getConversation" method accepts the access key as a parameter, which it sends to the API to get back a checksum of the conversation file that the server has. This checksum is saved in the app preferences. 
This local checksum is then compared with the server's, and if they don't match, it means the server has a different version of the conversation, in which case the app's conversation is replaced with the server's version. If the checksums aren't the same, the server returns a JSON array containing both the new checksum and the messages the user and the bot have sent. An iterator and a "while" loop are then used to go through the messages, and for each one, its text content is added to the app's list of messages, which the MessageAdapter class (analyzed further below) uses to create the chat bubbles that the user sees. The app also scrolls to the latest message if it isn't in view.

The "sendMessage" method is responsible for relaying the user's message to the bot's API, which then generates a response and returns it. This method accepts the access key and the user's message as parameters. It uses a POST request to send the text to the API. Once it receives a response, it empties the input field the user types their message into, passes the user message's information along to the app's message list, and finally scrolls down to the end of the latest message's chat bubble. If there is an error with sending the message to the server, the input field is given the value of the user's message again so they don't lose what they wrote.

After a short delay of 350ms, the app then adds the bot's response to the list and the MessageAdapter creates the chat bubble for it. The delay is added to make the flow of the conversation seem more natural; from a UX perspective, artificial loading times can often create a sense of "trust" between the user and the machine, because an instant reply would make it seem like the bot isn't actually "thinking" about what the user said, and those who have no understanding of how a response is generated in the background might think it's impossible for a reply to be returned instantaneously.
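The checksum comparison that "getConversation" performs can be sketched in a few lines. This is a minimal Python illustration rather than the app's Java code: `FakeServer` and `ConversationClient` are hypothetical stand-ins for the API and the app, and the message fields ("user", "bot") follow the storage format described later in this write-up.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def checksum(conversation_json: str) -> str:
    """MD5 hex digest of the conversation content, used only as a change marker."""
    return hashlib.md5(conversation_json.encode("utf-8")).hexdigest()


class FakeServer:
    """Hypothetical stand-in for the bot's API, holding the conversation as JSON."""

    def __init__(self, conversation: Dict[str, dict]) -> None:
        self.conversation_json = json.dumps(conversation)

    def check_conversation(self) -> str:
        return checksum(self.conversation_json)

    def get_conversation(self) -> dict:
        return {
            "checksum": self.check_conversation(),
            "messages": json.loads(self.conversation_json),
        }


class ConversationClient:
    """Hypothetical stand-in for the app's conversation screen."""

    def __init__(self) -> None:
        self.local_checksum = ""  # starts empty, like the "checksum" preference
        self.messages: List[Tuple[str, str]] = []

    def sync(self, server: FakeServer) -> bool:
        """Refetch the conversation only if the checksums differ."""
        if server.check_conversation() == self.local_checksum:
            return False  # nothing changed, keep the local copy
        payload = server.get_conversation()
        self.local_checksum = payload["checksum"]
        self.messages = []
        for message in payload["messages"].values():
            if message["user"]:
                self.messages.append(("user", message["user"]))
            if message["bot"]:
                self.messages.append(("bot", message["bot"]))
        return True
```

The benefit of this design is that an unchanged conversation costs only one small request, since the full message list is transferred only when the checksums differ.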
The "atBottom" method simply checks to see how many items the conversation list has, and gets the position of the last item that's currently visible. If the last item's position is lower than the item count, then there must be more items out of view, meaning the method will return a value of 0 as the condition isn't true. If, however, the last item's position is higher than or equal to the item count, then it returns 1, as the condition is true, and the user is currently viewing the latest message. This is used to determine if the app should scroll down to the latest message.

The "showMessagePopup" method has two parameters: the ID of the message, and its content. The method starts by setting the messageID and messageContent textview values to the arguments that were passed to it. Each message has an ID that's generated using the UNIX timestamp, a hyphen, and a randomly generated 9-digit integer, so the first 10 characters of the message ID always represent the date and time when the message was sent. The "showMessagePopup" method uses this to convert the timestamp to a human-readable version in the format of "dd / MM / YYYY" and "HH:mm", which are joined by the word "at" to create a date such as "02 / 11 / 18 at 14:52". Finally, the method sets the background of the popup window to transparent so that the content behind the overlay is visible.

The "onMessageClick" method is used in conjunction with an interface in the MessageAdapter class called "OnMessageListener". This is because the popup cannot be created inside the MessageAdapter class, since the actual UI is created by "MainActivity". However, the chat bubbles' textviews are inflated/created using the MessageAdapter, so MainActivity cannot directly access them without using an interface between itself and the MessageAdapter, which MainActivity then "implements". "onMessageClick" accepts the position of the chat bubble item that's being clicked on as a parameter.
It starts by hiding the keyboard and getting the ID of the message (which in the app starts with "usr-" or "bot-" to easily distinguish who "owns" the chat bubble). This prefix is then determined, and the "from" variable is set accordingly (this is used by the API to determine, for example, whose side of the conversation to delete when the user wishes to delete a chat bubble in an interaction). The message's actual ID is then determined by removing the prefix. The "showMessagePopup" method is then called.

The message popup has a delete button among other components. When the user clicks on the delete button, a dialog window is shown asking the user if they are sure they want to delete the message. If they click on "Yes", the "deleteMessage" method is called.

The "deleteMessage" method accepts the access key, the message ID, and the sender of the message (user or bot) as parameters. It makes a POST request to send this information to the API. Once it gets a response, it calls the "getConversation" method. The way messages are stored and the code that takes care of all these functions will be analyzed and explained further below.

The "MessageAdapter" class is responsible for creating or "inflating" the textviews that are displayed as chat bubbles to the user. It contains a list called "responseMessageList" that holds the user's and bot's messages. When it creates a textview/chat bubble, it gives the textview an ID that consists of either the prefix "usr-" or "bot-" combined with the message's actual server-side recognized ID, which it retrieves using the ResponseMessage class methods. It then makes any possible links clickable in the textview.

The ResponseMessage class essentially keeps track of each message's attributes. It can then be used by other methods to retrieve information about a message by providing its position in the list.

The next 7 images show the PHP code of the bot's API.
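Before moving on to the API, the message ID handling described for "onMessageClick" and "showMessagePopup" can be sketched as follows. This is a Python illustration of the logic, not the app's Java code. The function names are hypothetical, a four-digit year is used, and the timestamp is formatted in UTC so the result is reproducible (the app presumably uses the device's local time).

```python
from datetime import datetime, timezone
from typing import Tuple


def split_view_id(view_id: str) -> Tuple[str, str]:
    """Split a chat bubble ID such as "usr-<message ID>" into (sender, message ID)."""
    prefix, _, message_id = view_id.partition("-")
    if prefix not in ("usr", "bot") or not message_id:
        raise ValueError(f"unexpected chat bubble ID: {view_id}")
    sender = "user" if prefix == "usr" else "bot"
    return sender, message_id


def format_sent_time(message_id: str) -> str:
    """Turn the 10-character UNIX timestamp at the start of a message ID into a date."""
    timestamp = int(message_id[:10])
    sent = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return sent.strftime("%d / %m / %Y at %H:%M")
```

For example, the chat bubble ID "usr-1541170320-123456789" splits into the sender "user" and the message ID "1541170320-123456789", whose timestamp formats as "02 / 11 / 2018 at 14:52" in UTC.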
The API code starts by assigning the location of the bot's configuration files to six different variables. It then includes the various API functions that will be used. It then determines the action the user wants the API to perform, and to determine this, it checks the URL for any queries, and if it doesn't find any, then it gets the desired action from the POST request. If the action is to send a notification, then the API verifies the access key is valid, before using the "send_notification" function to send a notification to the bot's Android app, and calling the "save_conversation" function to add the notification content to the conversation file so the user can view what the bot said within the app. The API then uses multiple methods to check what platform the user is on, and whether or not they're logged in. It first checks to see if the user's browser has a cookie set, and verifies the cookie. If a cookie is set, then the platform must be the web version since the Telegram and Android versions don't have a browser to even have a cookie. There are two web versions, a temporary one, and the main one. Since the user can't use both of them at the same time, the API rejects and resets the user's browser cookies for the bot if both the temporary and main cookies are set. The API then determines which conversation file to use based on the cookie that's set. If no cookies are set, then the API checks to see if the IP address of the device trying to access it originates from Telegram. If it does, then the platform is set to Telegram, and PHP's input stream is used to retrieve messages sent to the API. The user ID is then checked to see if I'm the one messaging the bot, or if someone else is trying to access it. If the IP of the device isn't the same as Telegram's, then the only possible platform left is the Android app. I'll have to change this if I ever make an iOS version, but it'd be easy to do. 
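The platform detection order described above (cookies first, then Telegram's IP addresses, then the Android app as the fallback) can be sketched like this. It is a simplified Python illustration, not the API's PHP code: the cookie names are hypothetical, cookie verification is omitted, and the IP ranges are an assumption based on the ranges Telegram has documented for webhook requests.

```python
from ipaddress import ip_address, ip_network
from typing import Dict

# Assumption: the ranges Telegram has documented for webhook traffic.
TELEGRAM_NETWORKS = [ip_network("149.154.160.0/20"), ip_network("91.108.4.0/22")]

# Hypothetical cookie names for the two web versions.
MAIN_COOKIE = "serenyx_main"
TEMP_COOKIE = "serenyx_temp"


def detect_platform(cookies: Dict[str, str], remote_ip: str) -> str:
    """Return "web", "web-temporary", "telegram", "android", or "rejected"."""
    has_main = MAIN_COOKIE in cookies
    has_temp = TEMP_COOKIE in cookies
    if has_main and has_temp:
        # Both web versions can't be used at once, so the cookies are rejected.
        return "rejected"
    if has_main:
        return "web"
    if has_temp:
        return "web-temporary"
    if any(ip_address(remote_ip) in network for network in TELEGRAM_NETWORKS):
        return "telegram"
    # No cookie and not Telegram: the only platform left is the Android app.
    return "android"
```

Because the Android app is identified purely by elimination, adding another client (such as an iOS app) would mean adding an explicit check before the final fallback.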
Once the access key, cookie, or whatever other method of verification is used has been validated, the API gives the user access to a number of further actions. The first one is "check-conversation", which returns an MD5 hash of the current conversation file, which acts as a checksum. The next action is "get-conversation", which just returns the content of the current conversation file, but converts it into HTML format if the user is using the Android app. The "get-message-details" action returns information about a particular message. The "download-app" action downloads the APK file of the bot. The "send-file" action allows the user to upload files to the bot, and then returns a download link for the uploaded file. The "delete-message" action does just that: it deletes a message based on its sender and the message ID.

From this point on, the message sent by the user to the bot will be called "user text", and the bot's response will be called "bot text" to make things easier. The word "unparsed" will refer to text that hasn't been analyzed and modified by the API, and "parsed" will refer to the opposite.

If the bot is being accessed on LAN, then the user can view the access keys of the bot. If the platform being used is the web version, then the user can use the "send-message" action to send messages to the bot, and the user text is set to the "text" field of the POST request after stripping it of any HTML characters. The "check-status" action is also used by the web version to determine if the API is responding.

If the platform is set to the app version, then the "login" action becomes available. By this point the access key will have already been validated, so the Firebase Cloud Messaging token sent to the API is saved for future use, and a success message is returned as a response, which tells the app to show the conversation screen. The app version of the bot can also be used to request the access keys of the other versions.
The API then has to generate a response based on the user's message, and return said response. It starts by using the "parse_input_text" function to parse the user text. It then separates the user text based on double uses of the "&" (ampersand) character (this doesn't currently get used in any way, but I'm planning on adding a feature that'll let the user chain commands together to save time). An array is then created containing the locations of the bot's configuration files. The "generate_response" function is then called, and the user text and bot text are saved to the appropriate conversation file.

Finally, the bot outputs its response, which changes based on what platform the user is on. If it's on Telegram, then the "send_telegram_message" function is used to send the bot's response to the user. If another platform is being used, and the bot actually has a response, then a JSON array is returned which contains the message ID, the user text, and the bot text; if the platform being used is the Android app, the bot's response is also converted into HTML format.

The next 29 images show the PHP code of the bot's API functions, some of which you already saw in the API code analysis.

When the user sends a message, it could contain text that would require me to give the bot multiple versions of the same phrase in order to get it to recognize each possible variation. Consequently, I decided to essentially "prepare" the user text before the bot generates a response: common punctuation is removed, the words "i am" are changed to "i'm", and the text is converted to lowercase. The bot does save the original user text though, so it still keeps a copy of the text before it is parsed with this function; this helps with things such as lists, where the user might need to use punctuation or capital letters etc.

This function is used after a response has been generated. Based on the platform the user is on, it modifies the response array accordingly.
For example, if the user is on the web version, then the response array is turned into a string, each individual response is separated using two "<br>" tags, and any markup formatting for bold text (**bold**) is converted to HTML (<b>bold</b>).

This function is used to get all the content of a URL, and then grab a specific part of the content by specifying starting and ending positions.

This function just generates a token. I have blurred it out to avoid anyone trying to brute force the algorithm and get access to my bot.

The help page of the bot can only be accessed if you're logged in, or you have a temporary access token. This function generates such a token.

This function simply resets all the bot's access keys. This is useful if I accidentally reveal a key publicly or someone steals them, as I can quickly change them.

This works the same way the "reset_keys" function works, but it just resets the cookies needed to access the web version.

The "validate_key" function works by comparing a key passed to it with a key stored in the bot's configuration files. Based on the comparison, the function returns a boolean value.

Same as the function above, but with cookies, except it also checks when the cookie token was generated, and revokes it if it was created more than 6 months ago for the main web version, or more than 10 minutes ago for the temporary version. Because the creation time of the cookie is stored on the server, a user can't just prolong the life of their access cookie, as the cookie itself isn't checked for this information.

This function checks to see if the FCM key is valid for the Android app. This isn't the same as the FCM token: the FCM key is used to send notifications with the FCM tokens as target devices, and the API needs to verify that the user has permission to send said notifications, so that's what the FCM key is used for.

This function returns all the bot's access keys.

Same as above, but returns the FCM key.

This returns the temporary web version access key.
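The cookie validation with server-side expiry described above can be sketched as follows. This is a minimal Python illustration, not the bot's PHP code: the function and parameter names are hypothetical, and "6 months" is approximated as 180 days. The important detail is that the creation time comes from the server's own records and not from the cookie.

```python
import hmac

# Maximum cookie age in seconds for each web version.
# Assumption: "6 months" is treated as 180 days here.
MAX_AGE_SECONDS = {
    "main": 180 * 24 * 60 * 60,
    "temporary": 10 * 60,
}


def validate_cookie(cookie_token: str, stored_token: str,
                    created_at: int, version: str, now: int) -> bool:
    """Return True if the cookie matches the stored token and hasn't expired.

    `created_at` is the UNIX time at which the server generated the token,
    so the client cannot extend the cookie's life by editing the cookie.
    """
    if not hmac.compare_digest(cookie_token, stored_token):
        return False
    return now - created_at <= MAX_AGE_SECONDS[version]
```

Using `hmac.compare_digest` for the comparison is a design choice of this sketch that avoids leaking information through timing; the write-up doesn't say how the original function compares tokens.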
This function clears a given conversation file. Each message is stored in a conversation file in an associative JSON array, with the message ID as its key, and four items including "time", "user", "bot", and "followup". The "time" is just the UNIX timestamp, which is a 10 character integer. The "user" item contains the user's unparsed text, the "bot" item contains the bot's response, and the "followup" item contains a boolean value based on whether or not the bot's response allows or requires a follow-up message (for example, the user can ask "how are you?" and the bot will reply with how it's doing, and in turn, asks how the user is doing, which would qualify for a situation in which a follow-up is required). The "delete_message" function works by setting the value of the appropriate item to nothing. So if the user wants to delete a message they sent, the function would look for that message's ID in the conversation array, and then set the "user" item's value for that message to an empty string. Then, the function checks to see if removing the "user" item would result in that message array being empty, and if that's the case, it completely removes that message array from the conversation. This is done to save space and to ensure there are no instances in which a message array exists where both the "user" and "bot" items hold no value. If no messages are found, then the conversation file's content is completely deleted, otherwise, the new, updated, and modified conversation replaces the old one. This function returns information about a message, and formats its timestamp to a human-readable date. The "save_conversation" function, if the user text isn't empty, generates a message ID (while checking to see if the same ID already exists, in which case it'll generate another one), creates an array containing the timestamp, user text, bot text, and "followup" boolean, and then appends it to the conversation file, finally returning the message ID. 
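To make the storage format concrete, here is a minimal Python sketch of the "save_conversation" and "delete_message" logic described above. It works on an in-memory dictionary in place of the JSON conversation file, and it is an illustration of the described behavior, not the bot's PHP code; the optional `now` parameter is a hypothetical addition that makes the example reproducible.

```python
import random
import time
from typing import Dict, Optional


def generate_message_id(now: Optional[int] = None) -> str:
    """Combine a UNIX timestamp, a hyphen, and a random 9-digit integer."""
    timestamp = int(time.time()) if now is None else now
    return f"{timestamp}-{random.randint(100_000_000, 999_999_999)}"


def save_conversation(conversation: Dict[str, dict], user_text: str,
                      bot_text: str, followup: bool,
                      now: Optional[int] = None) -> Optional[str]:
    """Append a message and return its ID, or None if the user text is empty."""
    if user_text == "":
        return None
    message_id = generate_message_id(now)
    while message_id in conversation:  # regenerate on the rare collision
        message_id = generate_message_id(now)
    conversation[message_id] = {
        "time": int(message_id.split("-")[0]),
        "user": user_text,
        "bot": bot_text,
        "followup": followup,
    }
    return message_id


def delete_message(conversation: Dict[str, dict], message_id: str, sender: str) -> None:
    """Blank one side ("user" or "bot") of a message; drop it once both are empty."""
    message = conversation.get(message_id)
    if message is None:
        return
    message[sender] = ""
    if message["user"] == "" and message["bot"] == "":
        del conversation[message_id]
```

Deleting only the user's side keeps the bot's reply in place, and the entry disappears entirely once the other side is deleted as well.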
The "generate_response" function is responsible for just that: generating a response based on the user's input. First, it gets the current conversation and the latest message, while setting the "followup" variable to false and creating an array which will contain the bot's responses. It then includes four other PHP files, which contain the bot's various responses and trigger phrases. I will not be showing those for security reasons, but they essentially work with thousands of "if" statements that add responses to the response array that was created earlier. This response array is then parsed and returned in another array which also contains the "followup" variable.

This function just removes the bot's configuration files and any other files the user can affect and modify.

This function creates the files that the previous function removes, basically creating a fresh copy of the bot. This means the user can call the "undeploy_bot" function and share the bot files without worrying about any personal information, API keys, or access keys being passed on, and then call the "deploy_bot" function to create those essential files and generate all the keys the bot requires to function normally.

The "send_notification" function sends a notification to the bot's Android app using Google's Firebase API along with the user's own FCM token (which is unique to every device).

This function sets the "webhook" of the bot's Telegram version. The "webhook" is basically the URL that Telegram forwards the user's text to in order to get a response, so it would be the URL to the bot's API.

This function sends a Telegram message from the bot.

This function converts bold text from markup (**bold**) to HTML (<b>bold</b>).

The "upload_file" function saves a given file to the server.

This function can be used to determine whether or not a file exists at a remote external URL.
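The bold conversion mentioned above, together with the way the web version joins individual responses with two "<br>" tags, can be approximated with a single regular expression. This is a Python sketch with hypothetical function names; the write-up doesn't show how the PHP function performs the conversion.

```python
import re
from typing import List

# Matches **text** lazily, so several bold spans on one line are handled separately.
BOLD_PATTERN = re.compile(r"\*\*(.+?)\*\*")


def bold_to_html(text: str) -> str:
    """Convert **bold** markup to <b>bold</b>."""
    return BOLD_PATTERN.sub(r"<b>\1</b>", text)


def format_for_web(responses: List[str]) -> str:
    """Join the individual responses with two <br> tags and convert bold markup."""
    return bold_to_html("<br><br>".join(responses))
```

For instance, `format_for_web(["Hi.", "**Done**"])` produces `Hi.<br><br><b>Done</b>`.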
This function allows the bot to control the smart bulb in my bedroom as long as the server and the light bulb are connected to the same network. This function creates a file called "log.txt" in the current directory, and puts whatever content is passed as an argument in the file. This function generates a message ID by combining a UNIX timestamp with a hyphen and a randomly generated 9 character integer. The next 4 images show the PHP code for the bot's web version functionality. Just like the API, the configuration files are assigned to variables, and the bot functions are included, and the desired action is determined. The "login" action works by using the "validate_key" function I explained earlier, and then setting a cookie on the user's browser based on which platform they're using. The "logout" action simply removes any cookies related to accessing the bot from the user's browser. The "show-help" action first verifies if the token the user has provided is correct and hasn't expired, and then includes the help page. If the user is already logged in, then they would be at the main page of the bot, which is the conversation screen. This page makes a number of POST requests to this web-version-exclusive PHP script. The "get-config" action returns the content of the bot's settings file. The "save-config" action saves any settings that are sent via a POST request, but it checks to make sure the content doesn't include any JavaScript, PHP, or HTML code as only CSS would be sent anyway (so the user can change the bot's colors and such). The "reset-keys" action resets the bot's access keys using the "reset_keys" function. The "check-update" action returns a timestamp of the last time the bot's JavaScript file was modified. This is used by the web version to determine whether or not the page needs to be refreshed to fetch the new and updated JS file. The bot's web version UI is shown in the images below. 
The web version of the bot looks pretty much identical to the app version, but the settings page is different, because it allows the user to change the page's color scheme. There are two fields the user can modify, one that contains the CSS code for the page's main colors (used for the backgrounds, text etc.) and another that contains CSS for the accent color, which in this case is purple. Changing the code in these fields updates the page instantly, allowing the user to change the colors and view the modifications live without refreshing the page or anything like that. The web version also has a light mode as seen above. To make debugging easier, I've also developed an "analysis" mode for the bot's web version. The analysis mode takes the content of the conversation file, and structures it into a sort of chronological flowchart that shows the message ID, the user's input text, the user's intention based on their input, the bot's response, and (if applicable) any context that was required for the response to be generated. Thank you for reading about this project, it's my favorite one. I have a huge interest in AI, and figured I knew how to make a pretty simple one myself, which (at least functionally) works just like Siri, Google Assistant, Bixby etc. that don't use machine learning when it comes to the responses they give, and the services they provide. Unfortunately, I don't have the resources to add any form of natural language processing to Serenyx yet, but I have played around with this idea before, and I could see it perhaps getting a lot better in the future. My goal was to create a bot that could automate a lot of the things that I do daily and save time, so my bot is really a tool for productivity. I'm constantly working on it and adding new features, so this analysis will never be truly up-to-date. 
There are some features I'd love to add that I haven't as of July 2019, including the ability to request an Uber or food using Uber Eats via Uber's API and geolocation functionality in the Android app. I'm also looking into rewriting some of the bot's code to make it more object-oriented as I believe it'd make the code much easier to deal with, and it'd result in the bot being more efficient. Unfortunately, for security reasons and such, I won't be making my bot fully open-source, and won't be publicly posting how its responses are generated.