Capture: the initial search for mytext
Purpose: run an initial search across many files and write the results to a temporary file, which is then used for the Cleanup. At this stage the search is deliberately general, because you first need to see how your data is laid out on the lines; most of the data you search is not consistent. For example, search for mytext in every subdirectory and ignore case.
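The initial capture described above can be sketched in a few lines of Python. This is only an illustration, not the tool the guide itself uses: the function name initial_search, the output format (path:line number:line) and the file name capture.tmp are assumptions made for the example.

```python
import os

def initial_search(top_dir, mytext, out_path):
    """Write every line containing mytext (any case) under top_dir to out_path.

    Returns the number of matching lines. Hypothetical helper for illustration.
    """
    needle = mytext.lower()
    out_abs = os.path.abspath(out_path)
    hits = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # os.walk visits top_dir and every subdirectory below it
        for dirpath, dirnames, filenames in os.walk(top_dir):
            dirnames.sort()  # visit subdirectories in a stable order
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                if os.path.abspath(path) == out_abs:
                    continue  # do not search the temporary file itself
                try:
                    with open(path, encoding="utf-8", errors="replace") as fh:
                        for lineno, line in enumerate(fh, start=1):
                            # compare lowercased text so the case is ignored
                            if needle in line.lower():
                                text = line.rstrip("\n")
                                out.write(f"{path}:{lineno}:{text}\n")
                                hits += 1
                except OSError:
                    continue  # unreadable file: skip it and keep going
    return hits
```

A call such as initial_search(".", "yellow", "capture.tmp"), run from the top directory, would leave the raw matches in capture.tmp for later refinement. Binary formats (PDFs, office documents) are read as text here and would need a separate conversion step.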
Go to the top directory level of the files using cd, as shown here.
Search just in this directory, or also in all subdirectories?
TIP on the mytext search: of the words to be searched, always choose the text that is more unusual, i.e. for "old yellow car" search for "yellow".
Also select some lines above and below (near) the match, so I can see whether this is the right file/context.
Search both upper and lower case, or do I know the exact mytext to be searched?
The text is an exact myword, like a known code, and not part of a word.
Search for mytext or mysecondtext or mythirdtext.
Search a list of mytexts stored in the file mylist.
mytext is always of a certain length, say 2 characters.
mytext begins/ends with 2 characters A-Z.
mytext begins/ends with 2 numbers.
mytext begins with 2 characters from a range (A or B or C or D), before/after a range of numbers.
Search only certain filenames or certain file extensions.
Also show filenames and directories in the results?
After viewing the initial results, definitely delete lines with mytext in them.
Are there PDFs, spreadsheets, PowerPoint or Word documents?

Capture: Refined search
Purpose: you have collected all the raw information, and it still has some unwanted lines. Now reduce the information to its essential lines.
Read in the temporary file from above, then chain together further commands that refine the search.

Refine the search to a certain section of lines:
only the section of lines after mytext, i.e. delete everything up to a certain text, or select everything after a certain time if it is a log
only the section of lines above mytext, or the lines between mytext and mysecondtext, or the lines after mytext

Location of text:
mytext is at the beginning of the line, between mytext and mysecondtext, or at the end of the line
only if mytext is in a particular column: the second column, the first column, the last column, or the second from last column

mytext has a particular pattern: i.e. the 1st character is lowercase, or 2 numbers, or any one of a few words somewhere on the line, or a code of 3 letters followed by 5 numbers
mytext has a certain fixed length, say 10 characters
the text I'm looking for is an exact word, i.e. it isn't text which can be part of a word (mytext like 'child' will also pick out 'childless' and 'unchildish'; myword like 'child' will only select 'child' and not pick out 'childish', 'unchildlike' etc.)
mytext is inside a whole paragraph: select the paragraph

mytext is a number: it is greater or smaller than a number, say 2011, or between 2 and 200, or equals / does not equal 2
the number is in a certain column, say the second, and is greater or less than a number, or equals / does not equal it, say 2011
mytext always begins/ends with a range of 2 or more numbers
myword has a fixed count of numbers, like the 5 numbers in a US zip code

multiple texts:
mytext as well as mysecondtext, both on the same line
mytext is followed by mysecondtext, which is followed by mythirdtext: on the line, mythirdtext comes after mysecondtext, which comes after mytext
is the length of a column smaller or greater than 2 characters?
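A few of the refinements listed above can be sketched as small Python filters that are chained together, each one taking lines and passing on fewer lines. This is a minimal sketch under stated assumptions: all function names are hypothetical, columns are assumed to be separated by whitespace, and the sample lines are invented for the example.

```python
import re

def after_marker(lines, marker):
    """Drop everything up to and including the first line containing marker."""
    seen = False
    for ln in lines:
        if seen:
            yield ln
        elif marker in ln:
            seen = True

def whole_word(lines, word):
    """Keep lines where word is a whole word ('child', not 'childish')."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return (ln for ln in lines if pattern.search(ln))

def column_greater(lines, col, limit):
    """Keep lines whose column col (1-based, whitespace-separated) is a number > limit."""
    for ln in lines:
        fields = ln.split()
        if len(fields) >= col:
            try:
                if float(fields[col - 1]) > limit:
                    yield ln
            except ValueError:
                pass  # the column is not a number: drop the line

def in_order(lines, *texts):
    """Keep lines where the given texts appear left to right in that order."""
    pattern = re.compile(".*".join(re.escape(t) for t in texts))
    return (ln for ln in lines if pattern.search(ln))

# Invented sample standing in for the lines of the temporary file.
sample = [
    "header junk",
    "START",
    "A1 2009 a child sings",
    "A2 2015 childish games",
    "A3 2015 one child, two cars",
]

# Chain: lines after START, whole word 'child', second column greater than 2011.
refined = list(column_greater(whole_word(after_marker(sample, "START"), "child"), 2, 2011))
print(refined)  # ['A3 2015 one child, two cars']
```

Because each filter only consumes and yields lines, the order of the chain can be rearranged or extended with further filters without changing the individual functions. With a real temporary file, the lines would first be read and stripped of their trailing newline.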