这篇是<OBIEE Usage Tracking增强--使用Python深入分析OBIEE访问日志>的的一部分,前一篇见
OBIEE Usage Tracking增强--使用Python深入分析OBIEE访问日志--总的想法
这篇介绍如何统计OBIEE记录在逻辑日志中访问字段,
具体步骤如下:
从逻辑日志表读取sql语句
先忽略如何从数据库读实现,将数据导出到文本文件即可,读取sql的语句如下:
select t.query_text from s_nq_acct t where t.query_src_cd='Report'
解析出每个字段
使用正则表达式匹配就可以找出所有字段
search_patten ='\"[a-zA-Z0-9_ ]*\".\"[a-zA-Z0-9_ ]*\"'
re.findall(search_patten,sqltext)
计数
使用dict存储数据,注意dict访问未定义数据会抛出异常,overide __missing__可以绕过这件事。
示例和结果:
示例使用的sql
SELECT "Organization"."StoreName" saw_0, "Organization"."StoreID" saw_1, REPORT_AGGREGATE(saw_1 BY ) FROM KPI WHERE ("Organization"."StoreID" = 'aaaaaaa') ORDER BY saw_0, saw_1
SELECT "Organization"."StoreName" saw_0, "Organization"."StoreID" saw_1, REP ORT_AGGREGATE(saw_1 BY ) FROM KPI WHERE ("Organization"."StoreID" = 'bbbb') ORDER BY saw_0, saw_1
输出结果
{'findcolumn': 1, '"Organization"."StoreName"': 2, '"Organization"."StoreID"': 4}
示例代码:
import csv
import re
class Counter(dict):
def __missing__(self, key):
return 0
def parse_column(sqltext):
search_patten ='\"[a-zA-Z0-9_ ]*\".\"[a-zA-Z0-9_ ]*\"'
return re.findall(search_patten,sqltext)
sqlcolumn_cnt = Counter()
sqltable_cnt = Counter()
reader = open('logicalsql_sample.txt', 'r')
currentline=0
for line in reader:
sqltext = line
currentline = currentline + 1
print('----parse sql:',currentline)
findcolumns = parse_column(sqltext)
for findcolumn in findcolumns:
#print(findcolumn)
sqlcolumn_cnt[findcolumn] += 1
sqltable_cnt[findcolumn[0:findcolumn.find('.') ]] += 1
reader.close()
print(sqlcolumn_cnt)