Data Parsing? Awk? Excel? Power-BI?

nicolewells · ‎09-25-2018

I don't really know how to ask this question; I just know there has to be a better way, and I'm looking for any nudge in the right direction. I have a custom report that has grown into a Frankenstein's monster. I'm trying to find better tools to make the process more stable, make the report easier for others to run, and better separate the data from the processing.

NOT SAFE FOR SANE WORKFLOWS - YOU HAVE BEEN WARNED

Data Sources

contents of flat text files on linux systems

Names and modify times of files within a directory

MySQL DB

CSV file

Data Collection, so far

Grepping text files and piping the output to sed so I end up with one record per line. (I run those commands via Ansible, so I can run the command once and get the results from the 5-10 children in a single list instead of having to run it once per child.)

ls plus grep plus cut and sed (I know, right?) (Also via Ansible)

script to run SQL query
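
For the "names and modify times" source, the ls/grep/cut/sed chain could possibly be replaced by a few lines of Python. This is only a minimal sketch: the function name `list_files_with_mtime` and the ISO-8601 UTC timestamp format are assumptions for illustration, not part of the existing process.

```python
import os
from datetime import datetime, timezone


def list_files_with_mtime(directory):
    """Return (name, modify time) pairs for regular files, sorted by name.

    The modify time is formatted as ISO-8601 in UTC (an assumed format;
    change it to whatever the report expects).
    """
    rows = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():  # skip subdirectories
                mtime = datetime.fromtimestamp(entry.stat().st_mtime, tz=timezone.utc)
                rows.append((entry.name, mtime.strftime("%Y-%m-%dT%H:%M:%SZ")))
    return sorted(rows)
```

Each pair is already one record, so it can be written out tab-separated with something like `print("\t".join(row))` and no further sed cleanup.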

Data Parsing, so far

Some manual normalization of SQL output, to make it tab separated

Excel spreadsheet with MID, LEN, LEFT functions that can split up the sed output into columns

Pivot table to generate a list of unique IPs (that breaks every time the file is moved to a different location. Thanks Microsoft)

Vlookups of those IPs, pulling data in from the various sources to make a master table (a vlookup can't display two values per lookup, so I built a separate step that lists all duplicate entries so I can deal with them)

Excel sorts and filters to carve up master table into groups of things
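
The unique-IP list and the vlookup steps above amount to a join keyed on IP. Here is a rough Python sketch of that idea which keeps every matching value instead of only the first, so duplicates show up in the master table itself. The column names (`ip`, `host`, `owner`) and the sample rows are made up for illustration; the real sources will look different.

```python
import csv
import io
from collections import defaultdict


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def build_master(sources, key="ip"):
    """Merge several row lists into one record per key value.

    Every other column maps to a list of distinct values, so two hits
    for the same IP are both kept rather than silently dropped.
    """
    master = defaultdict(lambda: defaultdict(list))
    for rows in sources:
        for row in rows:
            for column, value in row.items():
                if column != key and value not in master[row[key]][column]:
                    master[row[key]][column].append(value)
    return {k: dict(cols) for k, cols in sorted(master.items())}


# Hypothetical sample data standing in for the grep/sed and SQL outputs.
hosts_tsv = "ip\thost\n10.0.0.1\talpha\n10.0.0.2\tbeta\n10.0.0.1\talpha-old\n"
owners_tsv = "ip\towner\n10.0.0.1\tteam-a\n"

master = build_master([read_tsv(hosts_tsv), read_tsv(owners_tsv)])
```

Any column whose list has more than one entry is a duplicate to review, which would stand in for the separate duplicate-listing step.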

Goals

Separate data from parsing: a system where I can easily load a new data set, or apply a new 'parsing set' to an older data set.

Automate the filter and split part

Less likely to break

Simpler process, so I don't have to be involved to make the report

More elegant
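
One possible way to get the first two goals is to keep the "parsing set" as plain data (for example a small JSON file) and have a short script apply it to whatever data set is loaded. The sketch below is only an assumption about what such rules could look like: the rule format, the group names, and the `ip` column are hypothetical.

```python
import json

# Hypothetical 'parsing set': each rule sends rows whose column value
# starts with a given prefix to a named group. In practice this text
# could live in its own file, separate from the data.
RULES_JSON = """
[
  {"group": "lab",    "column": "ip", "prefix": "10.0."},
  {"group": "office", "column": "ip", "prefix": "192.168."}
]
"""


def split_into_groups(rows, rules):
    """Assign each row to the first rule it matches; the rest go to 'other'."""
    groups = {rule["group"]: [] for rule in rules}
    groups["other"] = []
    for row in rows:
        for rule in rules:
            if row.get(rule["column"], "").startswith(rule["prefix"]):
                groups[rule["group"]].append(row)
                break
        else:  # no rule matched this row
            groups["other"].append(row)
    return groups


rules = json.loads(RULES_JSON)
rows = [{"ip": "10.0.0.1"}, {"ip": "192.168.1.5"}, {"ip": "172.16.0.9"}]
groups = split_into_groups(rows, rules)
```

Swapping in a different rules file re-cuts the same data, and loading a different data set reuses the same rules, without touching the script.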

Ideas

Use awk to do additional processing on the linux CLI side? https://unix.stackexchange.com/questions/88550/vlookup-function-in-unix

I have some co-workers who use Power-BI, not sure if that would be a good fit

Some other free / cheap alternative?

Thank you for your time and input!

Stachu · ‎09-25-2018

From what I've read, Power BI should be able to do what you need.

You will still need to provide the updated location if the file moves. M can read all files from a folder, so you could set up flexible logic, but it will need a specific input for the references to work.

FYI - from the data transformation perspective, it can all be achieved the same way in Excel by using Power Query (Get & Transform in Excel 2016); it's more about how you set up the process than the tool itself. In a sense it's even more flexible, because you can use VBA to read the current file location and look for the latest version of the data source in the same folder.
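
The "latest version of the data source in the same folder" idea is not specific to VBA or M. As a rough illustration in Python rather than either of those (a swapped-in language, not what Power Query itself does), picking the newest file could look like the sketch below. The function name and the `*.csv` default pattern are assumptions.

```python
from pathlib import Path


def newest_file(folder, pattern="*.csv"):
    """Return the most recently modified file matching `pattern`, or None."""
    candidates = [p for p in Path(folder).glob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

Calling `newest_file(".")` from the report's own folder means the data source is found relative to wherever the folder currently lives, so moving the folder does not break a hard-coded path.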

