Creating a chatbot to assist in network operations [Tutorial]

In this tutorial, we will learn how to use chatbots to assist in network operations. As we move toward intelligent operations, another area to focus on is mobility. It’s good to have a script to perform configurations, remediations, or even troubleshooting, but someone still has to be present to monitor, initiate, or execute those programs or scripts.

Nokia’s MIKA is a good example of a chatbot that operations personnel can use for network troubleshooting and repair. According to Nokia’s blog, MIKA responds with alarm prioritization information based on the realities of the individual network and also compares the current situation with the whole service history of past events from this network and others, in order to identify the best solution for the current problem.

Let’s create a chatbot to assist in network operations. For this use case, we will use a widely used chat application, Slack. Drawing on the intelligent data analysis capabilities of Splunk, we will see a user chatting with the chatbot to get some insight into the environment.

This tutorial is an excerpt from a book written by Abhishek Ratan titled Practical Network Automation – Second Edition. This book will acquaint you with the fundamental concepts of network automation and help you improve your data center’s robustness and security.

The code for this tutorial can be found on GitHub.

As we have our web framework deployed, we’ll leverage the same framework to interact with the Slack chatbot, which in turn will interact with Splunk. It can also interact directly with network devices so we can initiate some complex chats, such as rebooting a router from Slack if need be. This eventually gives mobility to an engineer who can work on tasks from anywhere (even from a cellphone) without being tied to a certain location or office.

To create a chatbot, here are the basic steps:

  1. Create a workspace (or account) on Slack:

  1. Create an application in your workspace (in our case, we have created an app called mybot):

  1. Here is the basic information about the application (App ID and Client ID can be used along with other information that uniquely identifies this application):

  1. Add a bot capability to this application:

  1. Add the event subscriptions and map them to the external API that the messages will be posted to. An event subscription defines which API is called, and with what data, when someone addresses the chatbot in a chat:

Here, a crucial step is that once we type in the URL that accepts chat messages, that URL needs to be verified by Slack. Verification involves the API endpoint sending back, as a string or JSON, the same challenge value that Slack sent to it. If Slack receives the same value, it confirms that the endpoint is authentic and marks it as verified. This is a one-time process; any change to the API URL means repeating this step.

Here is the Python code in the Ops API framework that responds to this specific query:

import falcon
import json

def on_get(self,req,resp):
    # Handles GET request
    resp.status=falcon.HTTP_200 # Default status
    # A dict is used here because a set ({"..."}) cannot be serialized to JSON
    resp.body=json.dumps({"status":"Server is Up!"})

def on_post(self,req,resp):
    # Handles POST request
    print("In post")
    data=req.bounded_stream.read()
    # Default status
    resp.status=falcon.HTTP_200
    try:
        # Authenticating end point to Slack:
        # send challenge string back as response
        resp.body=json.loads(data)["challenge"]
    except (ValueError,KeyError,TypeError):
        # No challenge in the request, URL already verified
        resp.body=""

This validates the endpoint: if Slack sends a challenge, the code responds with the same challenge value, which confirms that this is the right endpoint for the Slack channel to send chat data to.
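To check the handshake before pointing Slack at the endpoint, the same logic can be exercised without Falcon. The following is a minimal sketch rather than the book's code: verification_reply is a hypothetical helper, and the sample body only mimics the shape of Slack's url_verification request (the token and challenge values are made up).

```python
import json


def verification_reply(raw_body):
    """Return the challenge to echo back, or "" if the body carries no challenge.

    Hypothetical helper that mirrors the on_post logic without the web framework.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return ""  # body is not valid JSON
    if isinstance(payload, dict) and "challenge" in payload:
        return str(payload["challenge"])
    return ""


# Made-up request body in the shape Slack uses for URL verification
sample = json.dumps({
    "token": "made-up-token",
    "challenge": "made-up-challenge-123",
    "type": "url_verification",
}).encode("utf-8")

print(verification_reply(sample))
```

Any other request, such as a regular chat event, yields an empty string, which matches the "URL already verified" branch of the handler.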

  1. Install this application (or chatbot) into any channels (this is similar to adding a user in a group chat):

The core API framework code that responds to specific chat messages performs the following actions:

  • Acknowledges any POST sent by Slack with a 200 response within three seconds. If this is not done, Slack reports back: endpoint not reachable.
  • Ensures that any message sent by the chatbot itself (not by a real user) is not answered again. Otherwise, a loop can occur: a message sent by the chatbot is treated as a new message in the Slack chat and is posted to the URL again. This would eventually make the chat unusable, filling it with repetitive messages.
  • Authenticates the reply with a token that is sent to Slack, so that Slack can be sure the reply comes from an authenticated source.
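The first two points can be sketched in isolation as follows. This is an illustrative outline, not the book's implementation: should_process and handle_in_background are hypothetical names, the channel ID C123 is a placeholder, and running the slow work in a thread is one assumed way to return the 200 acknowledgment inside the three-second window.

```python
import threading


def should_process(payload, own_bot_id):
    """Return True only for text events that did not come from our own bot."""
    event = payload.get("event", {})
    # Messages posted by a bot carry a bot_id; skipping our own avoids a reply loop
    return event.get("bot_id") != own_bot_id and "text" in event


def handle_in_background(handler, text):
    """Start the (possibly slow) handler in a thread so the HTTP reply is not delayed."""
    worker = threading.Thread(target=handler, args=(text,), daemon=True)
    worker.start()
    return worker


replies = []
user_event = {"event": {"channel": "C123", "text": "encode user[:]pass"}}
bot_event = {"event": {"channel": "C123", "text": "Encoded string: ...",
                       "bot_id": "BECJ82A3V"}}

for payload in (user_event, bot_event):
    if should_process(payload, "BECJ82A3V"):
        # join() is only here so the demo prints a stable result
        handle_in_background(replies.append, payload["event"]["text"]).join()

print(replies)
```

Only the user's message reaches the handler; the bot's own reply is dropped before any work is started.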

The code for the complete resource class is as follows:

import falcon
import json
from channel import channel_connect,set_data

# bot_id of this chatbot; events that carry it are our own replies
BOT_ID="BECJ82A3V"

class Bot_BECJ82A3V():
    def on_get(self,req,resp):
        # Handles GET request
        resp.status=falcon.HTTP_200 # Default status
        # A dict is used here because a set ({"..."}) cannot be serialized to JSON
        resp.body=json.dumps({"status":"Server is Up!"})

    def on_post(self,req,resp):
        # Handles POST request; every request is acknowledged with 200
        print("In post")
        resp.status=falcon.HTTP_200
        resp.body=""
        try:
            data=json.loads(req.bounded_stream.read())
        except ValueError:
            # Body is not valid JSON, nothing to process
            return
        # Authenticating end point to Slack:
        # send challenge string back as response and stop here
        if "challenge" in data:
            resp.body=data["challenge"]
            return
        event=data.get("event",{})
        # Ignore messages posted by this bot to avoid a reply loop
        if event.get("bot_id")==BOT_ID:
            print("Ignore message from same bot")
            return
        # Get the channel and data information
        try:
            channel=event["channel"]
            text=event["text"]
        except KeyError:
            # Not a chat message event
            return
        # Authenticate Agent to access Slack endpoint
        token="xoxp-xxxxxx"
        # Set parameters
        print(text)
        set_data(channel,token,resp)
        # Process request and connect to slack channel
        channel_connect(text)
        return

# falcon.API instance, callable from gunicorn
app=falcon.API()
# instantiate the bot resource class
Bot3V=Bot_BECJ82A3V()
# map the /slack URL to the bot resource class
app.add_route("/slack",Bot3V)

Performing a channel interaction response: The next piece of code interprets the specific chats that users have with the chatbot in the chat channel. It also sends the reply, together with the channel ID and the authentication token, to the Slack API at https://slack.com/api/chat.postMessage. This ensures that the reply is shown in the channel from which the chat originated. As a sample, we will use the chat to encode or decode a specific value.

For example, if we write encode username[:]password, it returns the base64-encoded string.

Similarly, if we write decode followed by the encoded string, the chatbot returns the username and password after decoding the string.

The code is as follows:

import json
import requests
import base64
from splunk_alexa import alexa

channl=""
token=""
resp=""

def set_data(Channel,Token,Response):
    # Store the channel, token, and Falcon response for the current chat
    global channl,token,resp
    channl=Channel
    token=Token
    resp=Response

def send_data(text):
    # Post the reply to the channel that the chat came from
    print(channl)
    # json.dumps escapes quotes and newlines in the reply text
    payload=json.dumps({"channel":channl,"text":text})
    slack_resp=requests.post("https://slack.com/api/chat.postMessage",data=payload,headers={"Content-type": "application/json","Authorization": "Bearer "+token})
    # Return the Slack API response so that callers can inspect it
    return slack_resp

def channel_connect(text):
    print(text)
    arg=text.split(' ')
    print(str(arg))
    path=arg[0].lower()
    if path not in ["decode","encode"]:
        # Anything else is treated as a question for Splunk
        text=""
        try:
            result=alexa(arg,resp)
            for i in result:
                print(i)
                for j in i.values():
                    print(j)
                    text=text+' '+str(j)
        except Exception:
            # No usable result came back
            text=""
        if not text:
            text="None"
        send_data(text)
        return
    print("decode api")
    if len(arg)<2:
        print("Please enter a string to decode")
        text=" argument cannot be empty"
        send_data(text)
        return
    deencode(arg,text)

def deencode(arg,text):
    decode=arg[1]
    if decode=='--help':
        text="encode/decode "
        send_data(text)
        return
    if arg[0].lower()=="encode":
        if '[:]' in decode:
            encoded=base64.b64encode(decode.encode('utf-8'))
            text="Encoded string: "+encoded.decode('utf-8')
        else:
            text="sample string format username[:]password"
        send_data(text)
        return
    try:
        creds=base64.b64decode(decode).decode("utf-8")
    except ValueError:
        # Raised for invalid base64 as well as for bytes that are not UTF-8
        print("problem while decoding String")
        text="Error decoding the string. Check your encoded string."
        send_data(text)
        return
    if '[:]' in creds:
        print("[:] substring exists in the decoded base64 credentials")
        # split based on the first match of "[:]"
        credentials=creds.split('[:]',1)
        username=credentials[0]
        password=credentials[1]
        status='success'
    else:
        text="encoded string is not in standard format, use username[:]password"
        send_data(text)
        print("the encoded base64 is not in standard format username[:]password")
        username="Invalid"
        password="Invalid"
        status='failed'
    temp_dict={}
    temp_dict['output']={'username':username,'password':password}
    temp_dict['status']=status
    temp_dict['identifier']=""
    temp_dict['type']=""
    print(temp_dict)
    text=" "+username+"  "+password
    slack_resp=send_data(text)
    print(slack_resp.text)
    print(slack_resp.status_code)
    return
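The request sent to chat.postMessage can be examined separately from the network call. In the sketch below, build_post_message is a hypothetical helper and C123 is a placeholder channel ID. The URL and the two headers are the ones used in the tutorial's send_data function. The point it illustrates is that serializing the body with json.dumps keeps the JSON valid even when the reply text contains quotes.

```python
import json

SLACK_POST_MESSAGE_URL = "https://slack.com/api/chat.postMessage"


def build_post_message(channel, text, token):
    """Return (url, headers, body) for a chat.postMessage call without sending it."""
    headers = {
        "Content-type": "application/json",
        "Authorization": "Bearer " + token,
    }
    # json.dumps escapes quotes and newlines that would break a hand-built JSON string
    body = json.dumps({"channel": channel, "text": text})
    return SLACK_POST_MESSAGE_URL, headers, body


url, headers, body = build_post_message("C123", 'status is "up"', "xoxp-xxxxxx")
print(body)
```

The returned values can be passed to requests.post(url, data=body, headers=headers) once a real token and channel ID are available.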

The following code queries the Splunk instance in response to a particular chat with the chatbot. The chat can ask for any router whose management interface (Loopback45) is currently down. A user can also ask for all routers on which the management interface is up. This English request is converted into a Splunk query and, based on the response from Splunk, the status is returned to the Slack chat.

Let us see the code that returns the result to the Slack chat:

from splunkquery import run

# Keyword phrases that identify a question about the management interface
match_dict={0:"routers management interface",1:"routers management loopback"}

def alexa(data,resp):
    # data is either a sentence or an already split list of words
    if isinstance(data,str):
        string=data.split(' ')
    else:
        string=data
    # The last word is the status being asked for; the rest is the question
    search_words=string[0:-1]
    param=string[-1]
    print("param"+param)
    for no in range(2):
        # Every keyword of the phrase must appear in the question
        if all(word in search_words for word in match_dict[no].split(' ')):
            if param.lower()=="up":
                query="search%20index%3D%22main%22%20earliest%3D0%20%7C%20dedup%20interface_name%2Crouter_name%20%7C%20where%20interface_name%3D%22Loopback45%22%20%20and%20interface_status%3D%22up%22%20%7C%20table%20router_name"
            elif param.lower()=="down":
                query="search%20index%3D%22main%22%20earliest%3D0%20%7C%20dedup%20interface_name%2Crouter_name%20%7C%20where%20interface_name%3D%22Loopback45%22%20%20and%20interface_status%21%3D%22up%22%20%7C%20table%20router_name"
            else:
                return "None"
            # Run the URL-encoded search against Splunk
            result=run(query,resp)
            return result
    # No keyword phrase matched the question
    return None

The following Splunk queries fetch the status:

  • For UP interfaces, the query is as follows:
index="main" earliest=0 | dedup interface_name,router_name | where interface_name="Loopback45" and interface_status="up" | table router_name
  • For DOWN interfaces (any status except up), the query is as follows:
index="main" earliest=0 | dedup interface_name,router_name | where interface_name="Loopback45" and interface_status!="up" | table router_name
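The percent-encoded strings used in the Python code are these same queries with a leading search keyword. As a sketch of how they could be generated instead of typed by hand, the following uses urllib.parse.quote from the standard library. QUERY_TEMPLATE and build_encoded_query are hypothetical names introduced for illustration.

```python
from urllib.parse import quote, unquote

# The plain query, with a placeholder for the comparison operator
QUERY_TEMPLATE = (
    'search index="main" earliest=0 '
    '| dedup interface_name,router_name '
    '| where interface_name="Loopback45" and interface_status{op}"up" '
    '| table router_name'
)


def build_encoded_query(status):
    """Return the percent-encoded search for routers whose Loopback45 is up or down."""
    if status not in ("up", "down"):
        raise ValueError("status must be 'up' or 'down'")
    op = "=" if status == "up" else "!="
    # safe="" makes quote() percent-encode every reserved character
    return quote(QUERY_TEMPLATE.format(op=op), safe="")


# Decoding the result shows the readable query again
print(unquote(build_encoded_query("down")))
```

Keeping one readable template and encoding it at run time avoids maintaining two long strings that differ only in the comparison operator.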

Let’s see the end result of chatting with the chatbot and the responses being sent back based on the chats.

The encoding/decoding example is as follows:

As we can see here, we sent a chat with the encode abhishek[:]password123 message. This chat was sent as a POST request to the API, which encoded the value as base64 and responded with the result prefixed by the words Encoded string:. In the next chat, we passed the encoded string with the decode option. The API function decoded it and responded to the Slack chat with the username abhishek and the password password123.
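The round trip in this example can be reproduced outside Slack with a few lines of Python. This is a minimal sketch: encode_credentials and decode_credentials are hypothetical helpers that follow the username[:]password convention used in the chat.

```python
import base64

SEPARATOR = "[:]"


def encode_credentials(username, password):
    """Join the pair with the [:] separator and return it as base64 text."""
    raw = username + SEPARATOR + password
    return base64.b64encode(raw.encode("utf-8")).decode("utf-8")


def decode_credentials(encoded):
    """Reverse encode_credentials and return (username, password)."""
    raw = base64.b64decode(encoded).decode("utf-8")
    if SEPARATOR not in raw:
        raise ValueError("expected the format username[:]password")
    # Split on the first separator only, so the password may contain "[:]"
    username, password = raw.split(SEPARATOR, 1)
    return username, password


token = encode_credentials("abhishek", "password123")
print(token)
print(decode_credentials(token))
```

Note that base64 is a reversible encoding, not encryption: anyone who sees the encoded string can decode it.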

Let’s see the example of the Splunk query chat:

In this example, we have shut down the Loopback45 interface on rtr1. After our scheduled discovery of the interfaces through the Python script, this data is now in Splunk. When asked which routers have the management interface (Loopback45) down, the chatbot should respond with rtr1. The Slack chat, On which routers the management interface is down, is passed to the API, which, upon receiving this payload, runs the Splunk query to get the stats. The return value (which, in this case, is rtr1) is given back as a response in the chat.

Similarly, the reverse query, On which routers the management interface is up, queries Splunk and returns rtr2, rtr3, and rtr4 (as the interfaces on all these routers are UP).
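The way these two chat sentences map to an up or down filter can be sketched as a simple keyword check. The following is an assumption-laden illustration rather than the book's code: interface_status_intent and REQUIRED_WORDS are hypothetical names, and unlike the tutorial's function, this version lowercases the sentence and strips trailing punctuation.

```python
REQUIRED_WORDS = {"routers", "management", "interface"}


def interface_status_intent(sentence):
    """Return "up" or "down" if the sentence asks about management interfaces, else None."""
    words = sentence.lower().rstrip("?.!").split()
    if not words:
        return None
    # The last word carries the status being asked for
    status = words[-1]
    if REQUIRED_WORDS.issubset(words) and status in ("up", "down"):
        return status
    return None


print(interface_status_intent("On which routers the management interface is down"))
print(interface_status_intent("On which routers the management interface is up"))
```

A sentence that lacks any of the required words, or does not end in up or down, returns None, so the chatbot can reply that it did not understand the request.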

This chat use case can be extended to ensure that full end-to-end troubleshooting can occur using a simple chat. Extensive cases can be built using various backend functions, starting from a basic identification of problems to complex tasks, such as remediation based upon identified situations.

Summary

In this tutorial, we implemented some real-life use cases and looked at techniques to perform troubleshooting using a chatbot. The use cases gave us insight into performing intelligent remediation as well as performing audits at scale, which are key challenges in the current environment.

To learn how to automate your own network without any hassle while leveraging the power of Python, check out our book Practical Network Automation – Second Edition.
