M.U.P.P.I.X. purveyors of fine Data Analysis Tools

Muppix example usage and projects
delete duplicate files (photos, mp3s, films, etc.) so client docs can safely be copied across and only new ones are copied; match on exact file signature, filename only, or same file size, then delete selected files from the found list
extract emails or phone numbers from a mix of unstructured documents, i.e. all docs & correspondence for Some-Client Inc in a directory, then extract & sort the contact details

extract data or a table from a website, i.e. financial prices from investment websites
extract websites using key macros
find files: downloaded something in the last 5 minutes, I don't know its name, but it's somewhere on my computer
biggest directories & subdirectories, i.e. pictures/photos & films, or the hard drive is getting full and I want to see which large directories can be deleted
extract tables of data from loose text/reports & view in a spreadsheet
extract data from any software screens & create a file, using a programmable keystroke macro & a simple Unix server
extract a table from an official pdf or document; extract multiple tables & convert into a spreadsheet
get contact phone numbers & addresses
find latest directories added, i.e. added new audio-story-books, but can't remember if stored by author, title, etc.
compare directories
compare directories to see what has changed in last week  
find biggest directories
most recent directories
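
A rough sketch of the "downloaded in the last 5 minutes" and "biggest directories" searches above, using plain Unix tools rather than Muppix keywords. The function names are made up for illustration, and `-mmin` is a GNU/BSD `find` extension rather than POSIX, so treat this as a starting point.

```shell
# Files modified in the last N minutes under a directory
# (defaults: current directory, 5 minutes). Names are illustrative only.
recent_files() {
    find "${1:-.}" -type f -mmin "-${2:-5}"
}

# Largest directories first, sizes in kilobytes (top 10 by default).
biggest_dirs() {
    du -k "${1:-.}" 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

For example, `recent_files ~ 5` lists what landed under the home directory in the last 5 minutes, and `biggest_dirs /data 20` shows the 20 largest directories under a hypothetical /data.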

find amounts greater than a certain number
trim large files - delete multiple spaces in the lines / blank lines
left align - delete space begin
don't want - get rid of rows that have FRED in the 7th field, search for
delete column second
add some blanks at the beginning of the text - insert space begin
remove lines 40 - 50 - delete fixed lines second
remove lines that have FRED in twice - delete mytext second
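
Several of the clean-up tasks above can be sketched as small awk/sed filters that read standard input. These are not the Muppix one-liners themselves, and the function names and whitespace-separated field layout are assumptions for illustration.

```shell
# Keep rows whose numeric field N is greater than a limit: amounts_over FIELD LIMIT
amounts_over() { awk -v f="$1" -v limit="$2" '($f + 0) > limit'; }

# Squeeze runs of spaces to one, strip leading/trailing space, drop blank lines.
trim_lines() { sed -e 's/  */ /g' -e 's/^ //' -e 's/ $//' -e '/^$/d'; }

# Drop rows whose field N equals a given word: drop_rows_where FIELD WORD
drop_rows_where() { awk -v f="$1" -v w="$2" '$f != w'; }

# Remove a range of lines, e.g. delete_lines 40 50
delete_lines() { sed "${1},${2}d"; }
```

So `drop_rows_where 7 FRED < report.txt | delete_lines 40 50` would remove the FRED rows and then lines 40 to 50 of what remains (report.txt is a placeholder name).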

sort by 3rd & 7th column - sort column second
sort by length of each row - sort length
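
As a hedged illustration of the two sorts above with standard tools (function names are hypothetical, and fields are assumed to be separated by spaces):

```shell
# Sort by the 3rd column, then by the 7th (whitespace-separated fields).
sort_by_3_and_7() { sort -b -k3,3 -k7,7; }

# Sort rows by length, shortest first:
# prefix each row with its length, sort numerically, then strip the prefix.
sort_by_length() {
    awk '{ print length($0) "\t" $0 }' | sort -n | cut -f2-
}
```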

unique product codes we traded in the last month - single column mytext (first select files using "last")


standardise the dates & timestamps so I can glue together the various log files, sort them and find out exactly what happened on the various servers during a specified time

how many of each product did we sell in the last quarter - occurrence (select files using "last")
format numbers or dates

get rid of files starting with oldProject, older than 6 months - delete files
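
A cautious sketch of that clean-up with `find`, taking "6 months" as roughly 180 days (an assumption). It only lists candidates; the function name is illustrative.

```shell
# List files named oldProject* last modified more than ~6 months (180 days) ago.
# Prints candidates only; once the list looks right, the same find expression
# can be followed by -exec rm {} + to remove them.
old_project_files() {
    find "${1:-.}" -type f -name 'oldProject*' -mtime +180
}
```

Listing first and deleting second keeps a mistyped pattern from removing the wrong files.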

save as spreadsheet / access db

Read hourly temperatures off a government website for every area and determine when extended periods of cold weather occur, to calculate what size geothermal unit to install

clear out copies of files on your hard drive, & determine in which order to delete

find post code
find phone number
find ip address
find ISBN code
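
Pattern searches like these usually come down to a regular expression. A minimal sketch for two of them follows, assuming a grep that supports `-o` (GNU and BSD do). The patterns are deliberately loose, and post codes and phone numbers vary by country, so they would need their own expressions.

```shell
# Print every IPv4-looking token (does not check that each octet is <= 255).
find_ip() { grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}'; }

# Print 13-digit ISBNs starting 978 or 979, allowing a hyphen or space
# before any digit. The check digit is not verified.
find_isbn13() { grep -Eo '97[89]([- ]?[0-9]){10}'; }
```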

how many products sold in the last week:
find . -mtime -7 - gather the logs from the last 7 days into 1 file
select lines that have the word 'bought'
ignore lines with 'cancelled'
extract list of products from the logs; product codes start with XYZ JJJ ABC UUU
calculate how many of each product sold
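
The steps above can be chained into one pipeline. This is only a sketch: the `*.log` file name pattern, and the idea that a product code is one of the four prefixes followed by letters or digits, are assumptions about the log layout, and the function name is made up.

```shell
# Count product codes on 'bought' lines, skipping 'cancelled' ones,
# across log files modified in the last 7 days.
# Assumed: logs are named *.log; codes look like XYZ123, ABC7, etc.
count_products() {
    find "${1:-.}" -type f -name '*.log' -mtime -7 -exec cat {} + |
        grep -w 'bought' |
        grep -v 'cancelled' |
        grep -Eo '(XYZ|JJJ|ABC|UUU)[A-Za-z0-9]+' |
        sort | uniq -c | sort -rn
}
```

Each output line is a count followed by a product code, with the best seller first.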

compare 2 spreadsheets for differences
save each as a csv file, mySpreadsheet1.csv & mySpreadsheet2.csv, copy to mytest
diff mySpreadsheet1.csv mySpreadsheet2.csv

or use emacs Tools, Compare 2 buffers
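
Plain `diff` is sensitive to row order, so a spreadsheet that was merely re-sorted shows up as changed. One possible variation, sketched here with a hypothetical function name, sorts both exports first and prints only the rows unique to each file.

```shell
# Rows that appear in only one of two CSV exports, ignoring row order.
# '<' marks rows only in the first file, '>' rows only in the second.
csv_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort "$1" > "$a"
    sort "$2" > "$b"
    diff "$a" "$b" | grep '^[<>]' || true
    rm -f "$a" "$b"
}
```

Usage would be `csv_diff mySpreadsheet1.csv mySpreadsheet2.csv`.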

read bank account statements into a spreadsheet
website: run macro, VBA macro to populate

wine compare
solar knmi hourly temperature

sales database CRM / emails

delete duplicate files, photos, music/films, giving priority to certain directories; match files on name, creation date, size, or name & size, or only certain files, i.e. only music or films or spreadsheets
delete selective range of files
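
Matching on file content can be sketched with checksums. This is an assumption-laden illustration, not the Muppix method: it uses POSIX `cksum` (a CRC, so confirm matches with `cmp` before deleting anything), and it treats the first file in sort order as the one to keep rather than giving priority to particular directories.

```shell
# Print every file whose checksum and size match an earlier file in sort order,
# i.e. the extra copies. The first file of each matching group is not printed.
extra_copies() {
    find "${1:-.}" -type f -exec cksum {} + |
        LC_ALL=C sort |
        awk '{ key = $1 " " $2 }
             key == prev { sub(/^[0-9]+ [0-9]+ /, ""); print }
             { prev = key }'
}
```

The output is a list of candidate files to review, which fits the "delete selected files from the found list" step.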

Market research: KNMI solar analysis, hourly temperatures. With Muppix, more data is better; size is not a problem.

Muppix provides innovative solutions and training to make sense of large-scale data.
Backed by years of industry experience, the Muppix Team have developed a Free Data Science Toolkit to extract and analyse multi-structured information from diverse data sources.

