Ngrepping the Vertica network protocol

2 minute read Published:

How I used ngrep to reverse-engineer the Vertica wire protocol from JDBC and contribute it upstream to Python and Ruby-native Vertica libraries

I’ve been sitting on this draft article for a long time and it’s good information so I’ll make it short and sweet.

When playing with Vertica, and using the Uber vertica-python library to write a cli tool, I noticed that the connections originating from the library were not populating any client information:

dbadmin=> select * from sessions;
     node_name     | user_name | client_hostname | client_pid |        login_timestamp        |      session_id      | client_label |       transaction_start       |  transaction_id   |        transaction_description         |        statement_start        | statement_id | last_statement_duration_us | runtime_priority |    current_statement    |     last_statement      | ssl_state | authentication_method | client_type |
     client_version |              client_os              | client_authentication_name | client_authentication | jvm_memory_kb 
     -------------------+-----------+-----------------+------------+-------------------------------+----------------------+--------------+-------------------------------+-------------------+----------------------------------------+-------------------------------+--------------+----------------------------+------------------+-------------------------+-------------------------+-----------+-----------------------+-------------+----------------+-------------------------------------+----------------------------+-----------------------+---------------
      v_docker_node0001 | dbadmin   | ::1:42970       |        265 | 2017-03-09 04:44:21.700761+00 | 96b3096119f9-69:0x2b |              | 2017-03-09 04:44:25.657386+00 | 45035996273705081 | user dbadmin (select * from sessions;) | 2017-03-09 04:44:46.585475+00 |            4 |                       5365 |                  | select * from sessions; | select * from sessions; | None      | ImpTrust              | vsql        | 07.01.0100     | Linux
      4.9.13-200.fc25.x86_64 x86_64 | default: Implicit Trust    |                     0 |             0
      (1 row)

The JDBC driver on the other hand gave some good information. At a glance, when running select * from sessions, I couldn’t tell that it was my Python tool which was spawning the connections.

Using ngrep I was able to find what the JDBC wire protocol looked like (using the sumitchawla/vertica Vertica Docker container for convenience):

sevag@sevag-HP-ProBook-6470b:~/repos/scratch-vsql$ sudo ngrep -d any port 5433                                                                
interface: any
filter: (ip or ip6) and ( port 5433 )


##########
T 127.0.0.1:52882 -> 127.0.0.1:5433 [AP]
  ........user.dbadmin.database.docker.client_encoding.UNICODE.DateS
  tyle.ISO.client_pid.10370.client_label.jdbc-07.01.0200-ceadc093-99
  97-442a-877f-293c8c9f38bd.client_type.JDBC Driver.client_version.0
  7.01.0200.client_os.Linux 4.8.0-22-generic amd64..                
##
T 172.17.0.1:53756 -> 172.17.0.2:5433 [AP]
  ........user.dbadmin.database.docker.client_encoding.UNICODE.DateS
  tyle.ISO.client_pid.10370.client_label.jdbc-07.01.0200-ceadc093-99
  97-442a-877f-293c8c9f38bd.client_type.JDBC Driver.client_version.0
  7.01.0200.client_os.Linux 4.8.0-22-generic amd64..                
#
T 172.17.0.1:53756 -> 172.17.0.2:5433 [AP]
  ........user.dbadmin.database.docker.client_encoding.UNICODE.DateS
  tyle.ISO.client_pid.10370.client_label.jdbc-07.01.0200-ceadc093-99
  97-442a-877f-293c8c9f38bd.client_type.JDBC Driver.client_version.0
  7.01.0200.client_os.Linux 4.8.0-22-generic amd64..                
###
T 172.17.0.2:5433 -> 172.17.0.1:53756 [AP]
  R........S....server_version.v7.1.1-0.S....long_string_types.on.S.
  ..#standard_conforming_strings.on.K.......`\...S...)client_locale.
  en_US@collation=binary.Z....I                                     
#

Finally, I made pull requests for the aforementioned vertica-python, and it’s predecessor, vertica-ruby:

Result:

dbadmin=> select client_os, client_version ,client_type, client_hostname, client_label, client_pid from sessions;
           client_os           | client_version | client_type | client_hostname  |             client_label              | client_pid 
-------------------------------+----------------+-------------+------------------+---------------------------------------+------------
 x86_64-linux                  | 1.0.1          | vertica-rb  | 172.17.0.1:38516 | vertica-rb-1.0.1-11ba0f9f74d437d53374 |       1772

Result (intermediate result, xxx were just placeholder strings early in my PR):

docker => select client_hostname, client_pid, client_label, client_type, client_version, client_os, client_authentication_name, client_authentication from sessions;
-[ RECORD 2 ]--------------+--------------------------------------------------------
client_hostname            | 172.17.0.1:58870
client_pid                 | 15578
client_label               | vertica-python-xxx
client_type                | vertica-python-xxx
client_version             | vertica-python-xxx
client_os                  | Linux-4.8.0-22-generic-x86_64-with-Ubuntu-16.10-yakkety
client_authentication_name | default: Implicit Trust
client_authentication      | 0