portcaster.blogg.se

Grep count








1. Or the lovely and very fast Rust-powered alternative, fd. The exact same set of challenges in the rest of this post applies whichever you’re using.
2. Here I’m usually using ripgrep, which is also powered by Rust and is ridiculously fast, but again: the same constraints apply.
3. And here I’m usually using cw, yet another very fast Rust implementation of a utility.
4. The same goes for rg --null and its rg -0 shorthand.
5. In fact, this looked a little different, because I was using all the Rust-powered substitutes: $ fd ".

I just whipped this up really quickly: #!/usr/bin/env python3 … minute_timestamp = datetime.strptime(entry, '%b %d %H:%M')

  • tr exists and is great for simple character substitution throughout a stream of text.
  • sed doesn’t (easily) work with newlines \n.
  • grep treats streams differently than files.

My first thought was to use sed, but sed works on lines, using \n as its separator, so you have to do shenanigans to get it to work. Much easier is to use tr, a utility I had never heard of before today, which is used to translate characters. (Credit to this Stack Overflow question for teaching me both of these things!) The tr man page’s description: “The tr utility copies the standard input to the standard output with substitution or deletion of selected characters.”

While it can do substantially more sophisticated transformations than this, too, it’s perfect for this simple text replacement: tr '\n' '\0' substitutes the null character \0 for every newline in the input stream, and then xargs -0 will do what we need.

The final workflow looks like this (separated onto multiple lines so it’s easier to follow): [5]

    $ find notes -name "*.md" |    # find the files
        grep "notes/2020" |        # filter them
        tr '\n' '\0' |             # replace newline with null
        xargs -0 wc -w             # count the words
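To see the whole workflow run end to end, here is a minimal sketch that builds a throwaway directory and counts words in it. The directory layout, the file names, and the `count_2020_words` function name are hypothetical, chosen only for illustration; the sketch also assumes a `find`, `grep`, `tr`, and `xargs` that behave like the common GNU or BSD versions.

```shell
#!/bin/sh
# Scratch directory with hypothetical notes; one file name contains a space.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/notes/2019" "$demo_dir/notes/2020"
printf 'old words here\n' > "$demo_dir/notes/2019/old.md"
printf 'one two three\n' > "$demo_dir/notes/2020/first note.md"
printf 'four five\n' > "$demo_dir/notes/2020/second.md"

# The find | grep | tr | xargs pipeline, wrapped in a function for reuse.
count_2020_words() {
    find notes -name "*.md" |    # find the files
        grep "notes/2020" |      # filter them by path
        tr '\n' '\0' |           # replace each newline with a null
        xargs -0 wc -w           # hand intact file names to wc
}

# Run from inside the scratch directory so the relative path resolves.
report=$(cd "$demo_dir" && count_2020_words)
printf '%s\n' "$report"

# With more than one file, wc ends its report with a "total" line.
total=$(printf '%s\n' "$report" | tail -n 1 | awk '{print $1}')
echo "total words: $total"

rm -rf "$demo_dir"
```

The file with a space in its name survives the trip because the only separator xargs honors here is the null byte that tr put in place of each newline.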


Meanwhile, grep --null separates file results with \0, but does not separate results from within a stream of text, which is what grep sees when we pipe the results of find into it. No matter what I did, I kept seeing an error.

The problem was that there were no \0 characters in the stream going into wc, but I was invoking it as xargs -0 wc -c, so it was trying to treat the list of all the matching files as a single argument… which, at over 12,000 characters long, far exceeded the operating system’s limits for file paths (on any file system in common use today).

After thinking about this for a few, I realized I needed to treat grep output as a plain text stream, rather than a list of files. Then the question was how to substitute the null character \0 for each of the newlines \n in that stream.
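The single-argument behaviour is easy to reproduce without touching any real files. The sketch below feeds a few hypothetical file names through grep and asks a child shell how many arguments xargs -0 handed it; `list_names` and the names it prints are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical file names, one per line, standing in for the output of find.
list_names() {
    printf '%s\n' "notes/2019/old.md" "notes/2020/a.md" "notes/2020/b b.md"
}

# sh -c receives the arguments from xargs as positional parameters,
# so $# is the number of arguments xargs passed along.
newline_args=$(list_names | grep "notes/2020" | xargs -0 sh -c 'echo $#' sh)
null_args=$(list_names | grep "notes/2020" | tr '\n' '\0' | xargs -0 sh -c 'echo $#' sh)

echo "arguments with newline separators: $newline_args"
echo "arguments with null separators:    $null_args"
```

With newline-separated input there is no null byte to split on, so everything grep printed arrives as one long argument; after tr, each matching name is its own argument.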


This post is 90% for myself in the future, when I (inevitably) ask this question again, but also for anyone else who hits this particular question about command-line invocations. Well, I know this works, but I wouldn’t be surprised if someone told me an even better way to implement it.

See the follow-up post, in which I show an easier and faster way of doing this… as long as you have the GNU versions of the utilities, or alternatives like ripgrep.

Summary: If you want to find files, filter them on file name, and pipe the result into some other Unix command (e.g. wc to count words), use tr to substitute the null character for newlines:

    $ find notes -name "*.md" |
        grep "notes/2020" |
        tr '\n' '\0' |
        xargs -0 wc -w

I sometimes want to use find to get a subset of files matching a pattern, further filter it with grep, and then do something with the results using xargs: most often, something like counting the words in the subset I found. My first instinct (and possibly yours if you’re reading this via a web search!) is to use find [1] to get the first set of files, do the further filtering with grep, [2] and finally use xargs to pipe the results into wc -w.

The first problem is that xargs, by default, splits its input on whitespace, so a file name containing spaces reaches wc as several separate arguments. If you hand any standard Unix utility a bunch of files this way, where any of them have spaces in their names, you’ll see reports that various files don’t exist, where the “file” named is just one part of an actual file name.

Normally, I solve this kind of thing using xargs -0, which uses the null \0 character as the separator for arguments to the command you invoke. If I were just using find, I would use its -print0 flag. If I were just using grep, I would use its --null flag. [4] Unfortunately, here I’m combining them. Since I’m not working with the output from find directly, its -print0 expression isn’t useful: those results will be piped into grep, which prints each result on a line.
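The space-splitting problem, and the usual fix when find is the only tool involved, can be sketched as follows. The scratch directory and file names are hypothetical, and the sketch assumes a find that supports -print0, as the GNU and BSD versions do.

```shell
#!/bin/sh
# Scratch directory with hypothetical notes; one file name contains a space.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/notes/2020"
printf 'one two three\n' > "$demo_dir/notes/2020/first note.md"
printf 'four five\n' > "$demo_dir/notes/2020/second.md"

# Default xargs splits on blanks as well as newlines, so "first note.md"
# reaches wc as two arguments and wc reports files that do not exist.
if find "$demo_dir/notes" -name "*.md" | xargs wc -w >/dev/null 2>&1; then
    naive_status="ok"
else
    naive_status="failed"
fi

# With find alone, -print0 plus xargs -0 keeps every path intact.
safe_total=$(find "$demo_dir/notes" -name "*.md" -print0 |
    xargs -0 cat | wc -w | tr -d ' ')

echo "naive pipeline: $naive_status"
echo "words counted with -print0 and xargs -0: $safe_total"

rm -rf "$demo_dir"
```

This only works while nothing sits between find and xargs; as soon as grep filters the stream, the null separators from -print0 are no longer what reaches xargs.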








