Friday, October 30, 2020

Dgsh – Directed Graph Shell, Hacker News


          

The directed graph shell, dgsh (pronounced /dæɡʃ/, "dagsh"), provides an expressive way to construct sophisticated and efficient big data set and stream processing pipelines using existing Unix tools as well as custom-built components. It is a Unix-style shell (based on bash) allowing the specification of pipelines with non-linear non-uniform operations. These form a directed acyclic process graph, which is typically executed by multiple processor cores, thus increasing the operation's processing throughput.

If you want to get a feel for how dgsh works in practice, skip right down to the examples section.

For a more formal introduction to dgsh or to cite it in your work, see:
Diomidis Spinellis and Marios Fragkoulis. Extending Unix Pipelines to DAGs. IEEE Transactions on Computers, 2017. DOI: 10.1109/TC.2017.2695447

Inter-process communication

dgsh provides two new ways for expressing inter-process communication.

Multipipes

are expressed as usual Unix pipelines, but can connect commands with more than one output or input channel. As an example, the comm command supplied with dgsh expects two input channels and produces three output channels: the lines appearing only in the first (sorted) channel, the lines appearing only in the second channel, and the lines appearing in both. Connecting the output of the comm command to the cat command supplied with dgsh will make the three outputs appear in sequence, while connecting it to the paste command supplied with dgsh will make the output appear in its customary format.
Multipipe blocks {{ … }}
a) send (multiple) input streams received on their input side to the asynchronously-running processes that reside within the block, and, b) pass the output produced by the processes within the block as (multiple) streams on their output side. Multipipe blocks typically receive input from more than one channel and produce more than one output channel. For example, a multipipe block that runs md5sum and wc -c receives two inputs and produces two outputs: the MD5 hash of its input and the input's size. Data to multipipe blocks are typically provided with a dgsh-aware version of tee and collected by dgsh-aware versions of programs such as cat and paste.
Stored values
offer a convenient way for communicating computed values between arbitrary processes on the graph. They allow the storage of a data stream's last record into a named buffer. This record can be later retrieved asynchronously by one or more readers. Data in a stored value can be piped into a process or out of it, or it can be read using the shell's command output substitution syntax. Stored values are implemented internally through Unix-domain sockets, a background-running store program, dgsh-writeval, and a reader program, dgsh-readval. The behavior of a stored value's I/O can be modified by adding flags to dgsh-writeval and dgsh-readval.
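To make the three output channels of comm concrete without a dgsh installation, here is a minimal sketch that uses only the standard comm(1) and its column-suppression flags. The helper name comm_channels and the labels it prints are made up for illustration; the dgsh version delivers the same three sets as separate streams rather than as labelled lines.

```shell
# Consistent collation, since comm expects sorted input
export LC_ALL=C

# comm_channels FILE1 FILE2 (hypothetical helper)
# Both files must already be sorted. Each line of output corresponds to
# one of the three channels described above.
comm_channels() (
    # Channel 1: lines only in the first file (suppress columns 2 and 3)
    echo "only-first: $(comm -23 "$1" "$2" | paste -sd' ' -)"
    # Channel 2: lines only in the second file (suppress columns 1 and 3)
    echo "only-second: $(comm -13 "$1" "$2" | paste -sd' ' -)"
    # Channel 3: lines common to both files (suppress columns 1 and 2)
    echo "both: $(comm -12 "$1" "$2" | paste -sd' ' -)"
)
```

Calling comm_channels with two sorted files prints one labelled line per channel, in the same order in which the dgsh comm emits its streams.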

Syntax

A dgsh script follows the syntax of a bash(1) shell script with the addition of multipipe blocks. A multipipe block contains one or more dgsh simple commands, other multipipe blocks, or pipelines of the previous two types of commands. The commands in a multipipe block are executed asynchronously (in parallel, in the background). Data may be redirected or piped into and out of a multipipe block. With multipipe blocks dgsh scripts form directed acyclic process graphs. It follows from the above description that multipipe blocks can be recursively composed.

As a simple example consider running the following command directly within dgsh

{{echo hello & echo world &}} | paste

or by invoking dgsh with the command as an argument.

dgsh -c '{{echo hello & echo world &}} | paste '

The command will run paste with input from the two echo processes to output "hello world". This is equivalent to running the following bash command, but with the flow of data appearing in the natural left-to-right order.

paste <(echo hello) <(echo world)

In the following larger example, which compares the performance of different compression utilities, the script's standard input is distributed to three compression utilities (xz, bzip2, and gzip), to assess their performance, and also to file and wc to report the input data's type and size. The printf commands label the data of each processing type. All the commands in the block pass their output to the cat command, which gathers their outputs in order.

    tee |
    {{
        printf 'File type:\t'
        file -

        printf 'Original size:\t'
        wc -c

        printf 'xz:\t\t'
        xz -c | wc -c

        printf 'bzip2:\t\t'
        bzip2 -c | wc -c

        printf 'gzip:\t\t'
        gzip -c | wc -c
    }} |
    cat
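The scatter-then-gather shape of this example can be approximated in a plain POSIX shell, which may help when reasoning about what the multipipe block does. The sketch below is only an approximation under stated assumptions: the helper name scatter_gather is hypothetical, the input is buffered in a temporary file (so, unlike dgsh, the data does touch the disk), and the commands run one after the other rather than in parallel.

```shell
# scatter_gather CMD... (hypothetical helper)
# Feed a copy of standard input to every CMD and print the outputs in
# argument order, which is roughly what "tee | {{ ... }} | cat" achieves.
scatter_gather() (
    buf=$(mktemp) || exit 1
    # Buffer standard input once so that it can be replayed
    cat >"$buf"
    for cmd in "$@"; do
        # Each command reads its own copy of the input
        sh -c "$cmd" <"$buf"
    done
    rm -f "$buf"
)
```

For instance, `scatter_gather "printf 'Original size:\t'" 'wc -c' "printf 'gzip:\t\t'" 'gzip -c | wc -c' <somefile` reproduces two of the report lines above; commands such as printf simply ignore their input, just as they do inside the multipipe block.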

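Formally, dgsh extends the syntax of the (modified) Unix Bourne-shell when bash is provided with the --dgsh argument as follows.

    <dgsh_block>     ::= '{{' <dgsh_list> '}}'

    <dgsh_list>      ::= <dgsh_list_item> '&'
                         <dgsh_list_item> <dgsh_list>

    <dgsh_list_item> ::= <simple_command>
                         <dgsh_block>
                         <dgsh_list_item> '|' <dgsh_list_item>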

Adapted tools

A number of Unix tools have been adapted to support multiple inputs and outputs to match their natural capabilities. This echoes a similar adaptation that was performed in the early 1970s when Unix and the shell got pipes and the pipeline syntax. Many programs that worked with files were adjusted to work as filters. The number of input and output channels of dgsh-compatible commands are as follows, based on the supplied command-line arguments.

Tool              Inputs  Outputs  Notes
cat (dgsh-tee)    0-N     0-M      No options are supported
cmp               0-2     0-1
comm              0-2     0-3      Output streams in order: lines only in first file, lines only in second one, and lines in both files
cut               0-1     1-N      With --multistream output each range into a different stream
diff              0-N     1        Typically two inputs. Compare an arbitrary number of input streams with the --from-file or --to-file options
diff3             0-3     1
grep              0-2     0-4      Available output streams (via arguments): matching files, non-matching files, matching lines, and non-matching lines
join              0-2     1
paste             0-N     1        Paste N input streams
perm              1-N     1-N      Rearrange the order of N input streams
sort              0-N     0-1      With the -m option, merge sort N input streams
tee (dgsh-tee)    0-N     0-M      Only the -a option is supported
dgsh-readval      0       1        Read a value from a socket
dgsh-wrap         0-N     0-1      Wrap non-dgsh commands and negotiate on their behalf
dgsh-writeval     1       0        Write a value to a socket

In addition, POSIX user commands that receive no input or generate no output are wrapped, when executed in a dgsh context, to specify the corresponding input or output capability. For example, an echo command in a multipipe block will appear to receive no input, but will provide one output stream. By default dgsh automatically wraps all other commands as filters.

Input-only

read, write.

Output-only

alias, en, basename, c99, cal, cflow, command, date, df, dirname, du, echo, expr, find, getopts, ipcrm, jobs, ls, make, man, printf, ps, pwd, tty, type, ulimit, umask, uname, what, who.

No input and output

bg, cd, chgrp, chmod, chown, cp, env, false, fg, kill, link, ln, mesg, mkdir, mkfifo, mv, newgrp, nice, nohup, renice, rm, rmdir, sleep, strip, test, touch, true, unalias, unlink, wait, yacc.

Finally, note that any dgsh script will accept and generate the number of inputs and outputs associated with the commands or multipipe blocks at its two endpoints.

The dgsh suite has been tested under Debian and Ubuntu Linux, FreeBSD, and Mac OS X. A Cygwin port is underway.

An installation of GraphViz will allow you to visualize the dgsh graphs that you specify in your programs.

    

Debian and Ubuntu GNU/Linux

Prerequisites

To compile and run dgsh you will need to have the following commands installed on your system:

make automake gcc libtool pkg-config texinfo help2man autopoint bison check gperf git xz-utils gettext

To test dgsh you will need to have the following commands installed on your system:

wbritish wamerican libfftw3-dev csh curl bzip2

    

Installation steps

Go through the following steps.

  1. Recursively clone the project's source code through its GitHub page.
    git clone --recursive https://github.com/dspinellis/dgsh.git
  2. Configure bash and the Unix tools adapted for dgsh.
    make config
  3. Compile all programs.
    make
  4. Install.
    sudo make install

By default, the program and its documentation are installed under /usr/local. You can modify this by setting the PREFIX variable in the `config` step, for example:

    make PREFIX=$HOME config
    make
    make install

    

Testing

Issue the following command.

make test

    

FreeBSD

Prerequisites

To compile and run dgsh you will need to have the following packages installed on your system:

devel/automake devel/bison devel/check devel/git devel/gmake devel/gperf misc/help2man print/texinfo shells/bash

To test dgsh you will need to have the following ports installed on your system:

archivers/bzip2 ftp/curl

    

Installation steps

Go through the following steps.

  1. Recursively clone the project's source code through its GitHub page.
    git clone --recursive https://github.com/dspinellis/dgsh.git
  2. Configure bash and the Unix tools adapted for dgsh.
    gmake config
  3. Compile all programs.
    gmake
  4. Install.
    sudo gmake install

By default, the program and its documentation are installed under /usr/local. You can modify this by setting the PREFIX variable in the `config` step, for example:

    gmake PREFIX=$HOME config
    gmake
    gmake install

    

Testing

Issue the following command.

gmake test

    

These are the manual pages for dgsh, the associated helper programs, and the API, in formats suitable for browsing and printing. The commands are listed in order of usefulness in everyday scenarios.

dgsh
    directed graph shell (HTML, PDF)
dgsh-tee
    buffer and copy or scatter standard input to one or more sinks (HTML, PDF)
dgsh-wrap
    allow any filter program to participate in a dgsh pipeline (HTML, PDF)
dgsh-writeval
    write values to a data store (HTML, PDF)
dgsh-readval
    data store client (HTML, PDF)
dgsh-monitor
    monitor data on a pipe (HTML, PDF)
dgsh-parallel
    create a semi-homogeneous dgsh parallel processing block (HTML, PDF)
perm
    permute inputs to outputs (HTML, PDF)
dgsh-httpval
    provide data store values through HTTP (HTML, PDF)
dgsh-merge-sum
    merge key value pairs, summing the values (HTML, PDF)
dgsh-conc
    input or output pipe concentrator for dgsh negotiation (used internally) (HTML, PDF)
dgsh-enumerate
    enumerate an arbitrary number of output channels (demonstration and debugging tool) (HTML, PDF)
dgsh_negotiate
    API for dgsh-compatible programs to specify and obtain dgsh I/O file descriptors (HTML, PDF)

Compression benchmark

Report file type, length, and compression performance for data received from the standard input. The data never touches the disk. Demonstrates the use of an output multipipe to source many commands from one, followed by an input multipipe to sink the output of many commands into one. It also demonstrates the use of dgsh-tee both to propagate the same input to many commands and to collect the output of many commands in order, in a way that is transparent to users.

    #!/usr/bin/env dgsh

    tee |
    {{
        printf 'File type:\t'
        file -

        printf 'Original size:\t'
        wc -c

        printf 'xz:\t\t'
        xz -c | wc -c

        printf 'bzip2:\t\t'
        bzip2 -c | wc -c

        printf 'gzip:\t\t'
        gzip -c | wc -c
    }} |
    cat

Git commit statistics

Process the Git history, and list the authors and days of the week ordered by the number of their commits. Demonstrates streams and piping through a function.

    #!/usr/bin/env dgsh

    forder()
    {
        sort |
        uniq -c |
        sort -rn
    }

    git log --format="%an:%ad" --date=default "$@" |
    tee |
    {{
        echo "Authors ordered by number of commits"
        # Order by frequency
        awk -F: '{print $1}' |
        forder

        echo "Days ordered by number of commits"
        # Order by frequency
        awk -F: '{print substr($2, 1, 3)}' |
        forder
    }} |
    cat
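The forder function is ordinary shell and can be tried on its own, outside dgsh. A minimal sketch with made-up author names as input:

```shell
# Consistent collation across machines
export LC_ALL=C

# Order the lines of standard input by frequency, most frequent first.
# Same pipeline as the forder function in the script above.
forder() {
    sort |
    uniq -c |
    sort -rn
}

# Illustrative run on fabricated sample data (not real commit authors):
#   printf 'bob\nalice\nbob\nbob\nalice\ncarol\n' | forder
# lists bob (3), then alice (2), then carol (1), each preceded by its count.
```

In the dgsh script the same function is applied twice in parallel, once to the author field and once to the day-of-week field of each commit record.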

C code metrics

Process a directory containing C source code, and produce a summary of various metrics. Demonstrates nesting, commands without input.

    #!/usr/bin/env dgsh

    {{
        # C and header code
        find "$@" \( -name \*.c -or -name \*.h \) -type f -print0 |
        tee |
        {{

            # Average file name length
            # Convert to newline separation for counting
            echo -n 'FNAMELEN: '
            tr \\0 \\n |
            # Remove path
            sed 's|^.*/||' |
            # Maintain average
            awk '{s += length($1); n++} END {
                if (n>0)
                    print s / n;
                else
                    print 0; }'

            xargs -0 /bin/cat |
            tee |
            {{
                # Remove strings and comments
                sed 's/#/@/g;s/\\[\\"'\'']/@/g;s/"[^"]*"/""/g;'"s/'[^']*'/''/g" |
                cpp -P |
                tee |
                {{
                    # Structure definitions
                    echo -n 'NSTRUCT: '
                    egrep -c 'struct[   ]*{|struct[   ]*[a-zA-Z_][a-zA-Z0-9_]*[       ]*{'
                    #}} (match preceding openings)

                    # Type definitions
                    echo -n 'NTYPEDEF: '
                    grep -cw typedef

                    # Use of void
                    echo -n 'NVOID: '
                    grep -cw void

                    # Use of gets
                    echo -n 'NGETS: '
                    grep -cw gets

                    # Average identifier length
                    echo -n 'IDLEN: '
                    tr -cs 'A-Za-z0-9_' '\n' |
                    sort -u |
                    awk '/^[A-Za-z]/ { len += length($1); n++ } END {
                        if (n>0)
                            print len / n;
                        else
                            print 0; }'
                }}

                # Lines and characters
                echo -n 'CHLINESCHAR: '
                wc -lc |
                awk '{OFS=":"; print $1, $2}'

                # Non-comment characters (rounded thousands)
                # -traditional avoids expansion of tabs
                # We round it to avoid failing due to minor
                # differences between preprocessors in regression
                # testing
                echo -n 'NCCHAR: '
                sed 's/#/@/g' |
                cpp -traditional -P |
                wc -c |
                awk '{OFMT = "%.0f"; print $1/1000}'

                # Number of comments
                echo -n 'NCOMMENT: '
                egrep -c '/\*|//'

                # Occurrences of the word Copyright
                echo -n 'NCOPYRIGHT: '
                grep -ci copyright
            }}
        }}

        # C files
        find "$@" -name \*.c -type f -print0 |
        tee |
        {{
            # Convert to newline separation for counting
            tr \\0 \\n |
            tee |
            {{
                # Number of C files
                echo -n 'NCFILE: '
                wc -l

                # Number of directories containing C files
                echo -n 'NCDIR: '
                sed 's,/[^/]*$,,;s,^.*/,,' |
                sort -u |
                wc -l
            }}

            # C code
            xargs -0 /bin/cat |
            tee |
            {{
                # Lines and characters
                echo -n 'CLINESCHAR: '
                wc -lc |
                awk '{OFS=":"; print $1, $2}'

                # C code without comments and strings
                sed 's/#/@/g;s/\\[\\"'\'']/@/g;s/"[^"]*"/""/g;'"s/'[^']*'/''/g" |
                cpp -P |
                tee |
                {{
                    # Number of functions
                    echo -n 'NFUNCTION: '
                    grep -c '^{'

                    # Number of gotos
                    echo -n 'NGOTO: '
                    grep -cw goto

                    # Occurrences of the register keyword
                    echo -n 'NREGISTER: '
                    grep -cw register

                    # Number of macro definitions
                    echo -n 'NMACRO: '
                    grep -c '@[   ]*define[   ][   ]*[a-zA-Z_][a-zA-Z0-9_]*('
                    # Number of include directives
                    echo -n 'NINCLUDE: '
                    grep -c '@[   ]*include'

                    # Number of constants
                    echo -n 'NCONST: '
                    grep -ohw '[0-9][x0-9][0-9a-f]*' | wc -l

                }}
            }}
        }}

        # Header files
        echo -n 'NHFILE: '
        find "$@" -name \*.h -type f |
        wc -l

    }} |
    # Gather and print the results
    cat

Find duplicate files

List the names of duplicate files in the specified directory. Demonstrates the combination of streams with a relational join.

    #!/usr/bin/env dgsh

    # Create list of files
    find "$@" -type f |

    # Produce lines of the form
    # MD5(filename)= 811bfd4b...e1899e7
    xargs openssl md5 |

    # Convert each line into a "filename md5sum" pair
    sed 's/^MD5(//;s/)= / /' |

    # Sort by MD5 sum
    sort -k2 |

    tee |
    {{

        # Print an MD5 sum for each file that appears more than once
        awk '{print $2}' | uniq -d

        # Promote the stream to gather it
        cat
    }} |
    # Join the repeated MD5 sums with the corresponding file names
    # Join expects two inputs, second will come from scatter
    # XXX make streaming input identifiers transparent to users
    join -2 2 |

    # Output same files on a single line
    awk '
    BEGIN {ORS=""}
    $1 != prev && prev {print "\n"}
    END {if (prev) print "\n"}
    {if (prev) print " "; prev = $1; print $2}'
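The final grouping step can be explored in isolation. The sketch below is not the script's method: it swaps the uniq -d plus join combination for a single awk pass, and the function name group_duplicates and its "hash filename" input format are assumptions made for illustration (file names are assumed to contain no whitespace).

```shell
# Consistent collation across machines
export LC_ALL=C

# group_duplicates (hypothetical helper)
# Read "hash filename" pairs and print, one group per line, the names of
# files whose hash occurs more than once.
group_duplicates() {
    # Bring equal hashes next to each other
    sort -k1,1 |
    awk '
        # Same hash as the previous record: extend the current group
        $1 == prev { line = line " " $2; count++; next }
        # New hash: emit the finished group if it had duplicates
        { if (count > 1) print line; prev = $1; line = $2; count = 1 }
        END { if (count > 1) print line }'
}
```

The dgsh version reaches the same grouping by scattering the sorted list, computing the repeated sums on one branch, and joining them back with the full list from the other branch.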

Highlight misspelled words

Highlight the words that are misspelled in the command’s first argument. Demonstrates stream processing with multipipes and the avoidance of pass-through constructs to avoid deadlocks.

    #!/usr/bin/env dgsh

    export LC_ALL=C

    tee |
    {{
        # Find errors
        {{
            # Obtain list of words in text
            tr -cs A-Za-z \\n |
            tr A-Z a-z |
            sort -u

            # Ensure dictionary is compatibly sorted
            sort /usr/share/dict/words
        }} |
        # List errors as a set difference
        comm -23

        # Pass through text
        cat
    }} |
    grep --fixed-strings --file=- --ignore-case --color --word-regex --context=2
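The core of this example is a set difference between the words of the text and the dictionary. A minimal sketch of just that step with standard tools follows; the helper name misspelled, its two file arguments, and the use of temporary files in place of dgsh streams are assumptions for illustration, and the dictionary is lower-cased here so that the comparison is case-insensitive.

```shell
# Consistent collation, so that sort and comm agree on ordering
export LC_ALL=C

# misspelled TEXT DICT (hypothetical helper)
# Print the lower-cased words of TEXT that do not appear in DICT.
misspelled() (
    tmp=$(mktemp -d) || exit 1
    # Obtain the sorted list of unique words in the text
    tr -cs 'A-Za-z' '\n' <"$1" | tr 'A-Z' 'a-z' | sed '/^$/d' | sort -u >"$tmp/words"
    # Ensure the dictionary is compatibly sorted
    tr 'A-Z' 'a-z' <"$2" | sort -u >"$tmp/dict"
    # Set difference: words in the text that are not in the dictionary
    comm -23 "$tmp/words" "$tmp/dict"
    rm -rf "$tmp"
)
```

In the dgsh script the resulting word list is not printed but fed to grep as a pattern file, while the original text arrives on grep's second input stream.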

Word properties

Read text from the standard input and list words containing a two-letter palindrome, words containing four consonants, and words longer than 12 characters.

    #!/usr/bin/env dgsh

    # Consistent sorting across machines
    export LC_ALL=C

    # Stream input from file
    cat $1 |

    # Split input one word per line
    tr -cs a-zA-Z \\n |
    # Create list of unique words
    sort -u |
    tee |
    {{
        # Pass through the original words
        cat

        # List two-letter palindromes
        sed 's/.*\(.\)\(.\)\2\1.*/p: \1\2-\2\1/;t
        g'

        # List four consecutive consonants
        sed -E 's/.*([^aeiouyAEIOUY]{4}).*/c: \1/;t
        g'

        # List length of words longer than 12 characters
        awk '{if (length($1) > 12) print "l:", length($1);
            else print ""}'
    }} |
    # Paste the four streams side-by-side
    paste |
    # List only words satisfying one or more properties
    fgrep :

Web log reporting

Creates a report for a fixed-size web log file read from the standard input. Demonstrates the combined use of multipipe blocks, writeval and readval to store and retrieve values, and functions in the scatter block. Used to measure throughput increase achieved through parallelism.

    #!/usr/bin/env dgsh

    # Output the top X elements of the input by the number of their occurrences
    # X is the first argument
    toplist()
    {
        uniq -c | sort -rn | head -$1
        echo
    }

    # Output the argument as a section header
    header()
    {
        echo
        echo "$1"
        echo "$1" | sed 's/./-/g'
    }

    # Consistent sorting
    export LC_ALL=C

    export -f toplist
    export -f header


    if [ -z "${DGSH_DRAW_EXIT}" ]
    then
        cat
    [...]
    /dev/null || gdate -f - +%a)' |
    sort |
    uniq -c |
    sort -rn
    }}
    }}

    # Hour
    {{
        call 'header "Accesses by Local Hour"'
        awk -F: '{print $2}' |
        sort |
        uniq -c
    }}
    }}
    dgsh-readval -q -s nAccess
    }} |
    cat

Text properties

Read text from the standard input and create files containing word, character, digram, and trigram frequencies.

    #!/usr/bin/env dgsh

    # Consistent sorting across machines
    export LC_ALL=C


    # Convert input into a ranked frequency list
    ranked_frequency()
    {
        awk '{count[$1]++} END {for (i in count) print count[i], i}' |
        # We want the standard sort here
        sort -rn
    }

    # Convert standard input to a ranked frequency list of specified n-grams
    ngram()
    {
        local N=$1

        perl -ne 'for ($i = 0; $i
    [...]
    digram.txt'
            # Trigram frequency
            call 'ngram 3 >trigram.txt'
            # Word frequency
            call 'ranked_frequency >words.txt'
        }}

        # Store number of characters to use in awk below
        wc -c |
        dgsh-writeval -s nchars

        # Character frequency
        sed 's/./&\
    /g' |
        # Print absolute
        call 'ranked_frequency' |
        awk 'BEGIN {
            "dgsh-readval -l -x -q -s nchars" | getline NCHARS
            OFMT = "%.2g%%"}
            {print $1, $2, $1 / NCHARS * 100}' >character.txt
    }}

C/C++ symbols that should be static

Given as an argument a directory containing object files, show which symbols are declared with global visibility, but should have been declared with file-local (static) visibility instead. Demonstrates the use of dgsh-capable comm (1) to combine data from two sources.

    #!/usr/bin/env dgsh

    # Find object files
    find "$1" -name \*.o |

    # Print defined symbols
    xargs nm |

    tee |
    {{

        # List all defined (exported) symbols
        awk 'NF == 3 && $2 ~ /[A-Z]/ {print $3}' | sort

        # List all undefined (imported) symbols
        awk '$1 == "U" {print $2}' | sort

    }} |
    # Print exports that are not imported
    comm -23

Hierarchy map

Given two directory hierarchies A and B passed as input arguments (where these represent a project at different parts of its lifetime) copy the files of hierarchy A to a new directory, passed as a third argument, corresponding to the structure of directories in B. Demonstrates the use of join to process results from two inputs and the use of gather to order asynchronously produced results.

#! / usr / bin / env dgsh  if [ -z "${DGSH_DRAW_EXIT}" -a ( ! -d "$1" -o ! -d "$2" -o -z "$3" ) ] then   echo "Usage: $ 0 dir-1 dir-2 new-dir-name" 1>& 2   exit 1 fi  NEWDIR="$ 3"  export LC_ALL=C  line_signatures () {   find $ 1 -type f -name '*. [chly]' -print |   # Split path name into directory and file   sed 's |  (. * ) /  ([^/] * ) |  1  2 |' |   while read dir file   do     # Print "directory filename content" of lines with     # at least one alphabetic character     # The fields are separated by and     sed -n "/ [a-z] / s | ^ | $ dir $ file | p" "$ dir / $ file"   done |   # Error: multi-character tab ' 001  001 '   sort -T `pwd` -t -k 2 }   export -f line_signatures   {{   # Generate the signatures for the two hierarchies   call 'line_signatures "$ 1"' - "$ 1"   call 'line_signatures "$ 1"' - "$ 2" }} |  # Join signatures on file name and content join -t -1 2 -2 2 |  # Print filename dir1 dir2 sed 's /// g' | awk -F 'BEGIN {OFS=""} {print $ 1, $ 3, $ 4}' |  # Unique occurrences sort -u | tee | {{   # Commands to copy   awk '{print "mkdir -p' $ NEWDIR '/" $ 3 ""}' |   sort -u    awk '{print "cp" $ 2 "/" $ 1 "' $ NEWDIR '/" $ 3 "/" $ 1 ""}' ' }} | # Order: first make directories, then copy files # TODO: dgsh-tee does not pass along first incoming stream cat | sh

Plot Git committer activity over time

Process the Git history, and create two PNG diagrams depicting committer activity over time. The most active committers appear at the center vertical of the diagram. Demonstrates image processing, mixing of synchronous and asynchronous processing in a scatter block, and the use of a dgsh-compliant join command.

#! / usr / bin / env dgsh  # Commit history in the form of ascending Unix timestamps, emails git log --pretty=tformat: '% at% ae' | # Filter records according to timestamp: keep (100000, now) seconds awk 'NF==2 & $ 1>100000 & $ 1 # Place committers left / right of the median # according to the number of their commits awk '{print $ 2}' | sort | uniq -c | sort -n | awk ' BEGIN { "dgsh-readval -l -x -q -s committers1" | getline NCOMMITTERS l=0; r=NCOMMITTERS;} {print NR% 2? l   : --r, $ 2} '| sort -k2 #}} | # Join committer positions with commit time stamps # based on committer email join -j 2 | ## Order by timestamp sort -k 2n | tee | {{ # Create portable bitmap echo 'P1'  {{ dgsh-readval -l -q -s committers2 dgsh-readval -l -q -s days }} | cat | tr ' n' '' | awk '{print $ 1, $ 2}'  perl -na -e ' BEGIN { open (my $ ncf, "- |", "dgsh-readval -l -x -q -s committers3"); $ ncommitters=; @empty [$ncommitters - 1]=0; @committers=@empty; } sub out { print join ("", map ($ _? "1": "0", @committers)), " n"; }  $ day=int ($ F [1] / 60 / 60 / 24); $ pday=$ day if (! defined ($ pday));  while ($ day!=$ pday) { out (); @committers=@empty; $ pday   ; }  $ committers [$F[2]]=1;  END {out (); } '' }} | cat | # Enlarge points into discs through morphological convolution pgmmorphconv -erodelarge.png # A smaller image pamscale -width 640 pnmtopng>small.png }}

Parallel word count

Count the number of times each word appears in the specified input file(s). Demonstrates parallel execution mirroring the Hadoop WordCount example via the dgsh-parallel command. In contrast to GNU parallel, the block generated by dgsh-parallel has N input and output streams, which can be combined by any dgsh-compatible tool, such as dgsh-merge-sum or sort -m.

    #!/usr/bin/env dgsh

    # Number of processes
    N=8

    # Collation order for sorting
    export LC_ALL=C

    # Scatter input
    dgsh-tee -s |
    # Emulate Java's default StringTokenizer, sort, count
    dgsh-parallel -n $N "tr -s ' \t\n\r\f' '\n' | sort -S 512M | uniq -c" |
    # Merge sorted counts by providing N input channels
    dgsh-merge-sum $(for i in $(seq $N) ; do printf '
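The count-then-merge idea behind this example can be sketched with standard tools alone. Everything below is an approximation under stated assumptions: the function names are hypothetical, the input is split by file rather than scattered by dgsh-tee, and the awk-based merge_sum merely imitates what the dgsh-merge-sum description says it does (merge key value pairs, summing the values).

```shell
# Collation order for sorting
export LC_ALL=C

# Count the words of standard input, using the tokenization of the script above
count_words() {
    tr -s ' \t\n\r\f' '\n' | sed '/^$/d' | sort | uniq -c
}

# Merge "count word" pairs, summing the counts of equal words
merge_sum() {
    awk '{ sum[$2] += $1 } END { for (w in sum) print sum[w], w }' |
    sort -k2,2
}

# parallel_wordcount FILE... (hypothetical helper)
# Count each file in a background process, then merge the partial counts.
parallel_wordcount() (
    tmp=$(mktemp -d) || exit 1
    i=0
    for f in "$@"; do
        count_words <"$f" >"$tmp/$i" &
        i=$((i + 1))
    done
    # Wait for all the background counters to finish
    wait
    cat "$tmp"/* | merge_sum
    rm -rf "$tmp"
)
```

Unlike the dgsh block, this sketch stores the partial counts in temporary files and needs one input file per worker.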

Venue author compare

Given the specification of two publication venues, read a compressed DBLP computer science bibliography from the standard input (e.g. piped from curl -s http://dblp.uni-trier.de/xml/dblp.xml.gz or from a locally cached copy) and output the number of papers published in each of the two venues as well as the number of authors who have published only in the first venue, the number who have published only in the second one, and the number who have published in both. The venues are specified through the script's first two command-line arguments as a DBLP key prefix, e.g. journals/acta/, conf/icse/, journals/software/, conf/iwpc/, or conf/msr/. Demonstrates the use of dgsh-wrap -e to have sed(1) create two output streams and the use of tee to copy a pair of streams into four ones.

#! / usr / bin / env dgsh  # Extract and sort author names sorted_authors () {   sed -n's / ([^/  1 / p '|   sort }  # Escape a string to make it a valid sed (1) pattern escape () {   echo "$ 1" | sed's /  ([/\] ) / \ 1 / g ' }  export -f sorted_authors  if [ ! "$2" -a ! "$DGSH_DOT_DRAW"]; then   echo "Usage: $ 0 key1 key2" 1>& 2   echo "Example: $ 0 conf / icse / journals / software /" 1>& 2   exit 1 fi  gzip -dc | # Output the two venue authors as two output streams dgsh-wrap -e sed -n " /^<.>/ w>| /^<.>/ w>| " | # 2 streams in 4 streams out: venue1, venue2, venue1, venue2 tee | {{   {{     echo -n "$ 1 papers:"     grep -c '^<. mdate="." key='     echo -n "$ 2 papers:"     grep -c ' authors common in both venues: cat>

Waves: 2D Fourier transforms

Create two graphs: 1) a broadened pulse and the real part of its 2D Fourier transform, and 2) a simulated air wave and the amplitude of its 2D Fourier transform. Demonstrates using the tools of the Madagascar shared research environment for computational data analysis in geophysics and related fields. Also demonstrates the use of two scatter blocks in the same script, and the use of named streams.

#! / usr / bin / env dgsh  mkdir -p Fig  # The SConstruct SideBySideIso "Result" method side_by_side_iso () { vppen size=r vpstyle=n gridnum=2,1 / dev / stdin $ * }  export -f side_by_side_iso  # A broadened pulse and the real part of its 2D Fourier transform sfspike n1=64 N2=64 d1=1 d2=1 nsp=2 k1=16, 17 k2=5.5 mag=16, 16  label1='time' label2='space' unit1=unit2=| sfsmooth rect2=2 | sfsmooth rect2=2 | tee | {{ sfgrey pclip=100 wanttitle=n  sffft1 | sffft3 axis=2 pad=1 | sfreal | tee | {{ sfwindow f1=1 | sfreverse which=3 cat }} | sfcat axis=1 "Fig / ft2dofpulse.vpl  # A simulated air wave and the amplitude of its 2D Fourier transform sfspike n1=64 D1=1 O1=32 nsp=4 k1=1,2,3,4 mag=1,3,3,1  label1='time' unit1=| sfspray n=32 d=1 o=0 | sfput label2=space | sflmostretch delay=0 v0=-1 | tee | {{ sfwindow f2=1 | sfreverse which=2 cat }} | sfcat axis=2 "Fig / airwave.vpl  wait

Nuclear magnetic resonance processing

Nuclear magnetic resonance in-phase/anti-phase channel conversion and processing in heteronuclear single quantum coherence spectroscopy. Demonstrates processing of NMR data using the NMRPipe family of programs.

    #!/usr/bin/env dgsh

    # The conversion is configured for the following file:
    # http://www.bmrb.wisc.edu/ftp/pub/bmrb/timedomain/bmr6443/timedomain_data/c13-HSQC/June11-se-6426-CA.fid/fid
    var2pipe -in $1 \
     -xN 1280 -yN 256 \
     -xT 640 -yT 128 \
     -xMODE Complex -yMODE Complex \
     -xSW 8000 -ySW 6000 \
     -xOBS 599.4489584 -yOBS 60.7485301 \
     -xCAR 4.73 -yCAR  000 \
     -xLAB 1H -yLAB 15N \
     -ndim 2 -aq2D States \
     -verb |
    tee |
    {{
        # IP/AP channel conversion
        # See http://tech.groups.yahoo.com/group/nmrpipe/message/389
        nmrPipe |
        nmrPipe -fn SOL |
        nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -c 0.5 |
        nmrPipe -fn ZF -auto |
        nmrPipe -fn FT |
        nmrPipe -fn PS -p0 177 -p1 0.0 -di |
        nmrPipe -fn EXT -left -sw -verb |
        nmrPipe -fn TP |
        nmrPipe -fn COADD -cList 1 0 -time |
        nmrPipe -fn SP -off 0.5 -end 0.98 -pow 1 -c 0.5 |
        nmrPipe -fn ZF -auto |
        nmrPipe -fn FT |
        nmrPipe -fn PS -p0 0 -p1 0 -di |
        nmrPipe -fn TP |
        nmrPipe -fn POLY -auto -verb >A

        nmrPipe |
        nmrPipe -fn SOL |
        nmrPipe -fn SP -off 0.5 -end 0.98 -pow 2 -c 0.5 |
        nmrPipe -fn ZF -auto |
        nmrPipe -fn FT |
        nmrPipe -fn PS -p0 177 -p1 0.0 -di |
        nmrPipe -fn EXT -left -sw -verb |
        nmrPipe -fn TP |
        nmrPipe -fn COADD -cList 0 1 -time |
        nmrPipe -fn SP -off 0.5 -end 0.98 -pow 1 -c 0.5 |
        nmrPipe -fn ZF -auto |
        nmrPipe -fn FT |
        nmrPipe -fn PS -p0 -90 -p1 0 -di |
        nmrPipe -fn TP |
        nmrPipe -fn POLY -auto -verb >B

    }}

    # We use temporary files rather than streams, because
    # addNMR mmaps its input files. The diagram displayed in the
    # example shows the notional data flow.
    if [ -z "${DGSH_DRAW_EXIT}" ]
    then
        addNMR -in1 A -in2 B -out A+B.dgsh.ft2 -c1 1.0 -c2 1.25 -add
        addNMR -in1 A -in2 B -out AB.dgsh.ft2 -c1 1.0 -c2 1.25 -sub
    fi

FFT calculation

Calculate the iterative FFT for n=8 in parallel. Demonstrates combined use of permute and multipipe blocks.

    #!/usr/bin/env dgsh

    dgsh-fft-input $1 |
    perm 1,5,3,7,2,6,4,8 |
    {{
        {{
            dgsh-w 1 0
            dgsh-w 1 0
        }} |
        perm 1,3,2,4 |
        {{
            dgsh-w 2 0
            dgsh-w 2 1
        }}

        {{
            dgsh-w 1 0
            dgsh-w 1 0
        }} |
        perm 1,3,2,4 |
        {{
            dgsh-w 2 0
            dgsh-w 2 1
        }}
    }} |
    perm 1,5,3,7,2,6,4,8 |
    {{
        dgsh-w 3 0

        dgsh-w 3 1

        dgsh-w 3 2

        dgsh-w 3 3
    }} |
    perm 1,5,2,6,3,7,4,8 |
    cat
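The argument of the first perm, 1,5,3,7,2,6,4,8, coincides with the bit-reversal ordering that iterative FFT algorithms customarily apply to their input, written with 1-based stream numbers. The text does not state this explicitly, so treat it as an observation rather than documented behavior. The sketch below, with a hypothetical function name, computes that ordering for a power-of-two n.

```shell
# bit_reversal_order N (hypothetical helper)
# Print the 1-based bit-reversal permutation of N streams, N a power of two.
bit_reversal_order() (
    n=$1
    # Number of bits needed to index n items
    bits=0
    while [ $((1 << bits)) -lt "$n" ]; do
        bits=$((bits + 1))
    done
    i=0
    out=
    while [ "$i" -lt "$n" ]; do
        # Reverse the lowest "bits" bits of i
        r=0
        j=0
        while [ "$j" -lt "$bits" ]; do
            r=$(( r | (((i >> j) & 1) << (bits - 1 - j)) ))
            j=$((j + 1))
        done
        # Append the 1-based position, comma separated
        out="${out:+$out,}$((r + 1))"
        i=$((i + 1))
    done
    echo "$out"
)
```

Running `bit_reversal_order 8` yields the same list that the script passes to its first perm invocation.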

Reorder columns

Reorder columns in a CSV document. Demonstrates the combined use of tee, cut, and paste.

    #!/usr/bin/env dgsh

    tee |
    {{
        cut -d, -f 5-6 -

        cut -d, -f 2-4 -
    }} |
    paste -d,

Directory listing
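For comparison, the same column reordering can be written in a plain POSIX shell. In this sketch the function name reorder_columns is hypothetical, and temporary files stand in for the streams that dgsh passes between tee, the two cut processes, and paste.

```shell
# reorder_columns (hypothetical helper)
# Read CSV from standard input and print columns 5-6 followed by columns 2-4.
reorder_columns() (
    tmp=$(mktemp -d) || exit 1
    # Buffer the input so that both cut invocations can read it
    cat >"$tmp/in"
    cut -d, -f 5-6 "$tmp/in" >"$tmp/a"
    cut -d, -f 2-4 "$tmp/in" >"$tmp/b"
    # Glue the two column groups back together, line by line
    paste -d, "$tmp/a" "$tmp/b"
    rm -rf "$tmp"
)
```

The dgsh version needs no intermediate files, because paste receives the two cut outputs directly as its two input streams.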

Windows-like DIR command for the current directory. Nothing that couldn't be done with ls -l | awk. Demonstrates use of wrapped commands with no input (df, echo).

    #!/usr/bin/env dgsh

    ls -n |
    tee |
    {{
        # Reorder fields in DIR-like way
        awk '!/^total/ {print $6, $7, $8, $1, sprintf("%8d", $5), $9}'

        # Count number of files
        wc -l | tr -d \\n

        # Print label for number of files
        echo -n ' File(s) '

        # Tally number of bytes
        awk '{s += $5} END {printf("%d bytes\n", s)}'

        # Count number of directories
        grep -c '^d' | tr -d \\n

        # Print label for number of dirs and calculate free bytes
        df -h . | awk '!/Use%/ {print " Dir(s) " $4 " bytes free"}'
    }} |
    cat
