#4 the terminal

text-scraping and editing

      ||                                          || 
      ||          ||              ||    ||               ||      
||   ||          ||                         ||    ||    ||      ||    ||    ||    ||
||   ||    ||          ||          ||          ||    ||      ||    ||          ||
||   ||    ||    ||    ||    ||    ||    ||                  ||    ||    ||    ||
||   ||                                                                              ||
||                                                                                   
||                                                                              
||                                       
||               ||    ||          ||    ||      
||   ||    ||    ||    ||    ||    ||    ||            
||   ||    ||    || a  || series   || of || zines    ||    
||   ||    ||    ||    ||    ||    ||    ||    ||    ||    ||      ||    ||          ||
|| inputs  ||     <--      ||      <--   ||       <--    ||      <--     ||  inputs  ||
||         ||   hardwares  ||       OS   ||  softwares ||    users ||      ||
|| outputs ||      -->     ||       -->  ||    -->     ||       -->  ||  outputs ||  ||
||   ||    ||    ||    ||    ||    ||    ||    ||    ||    ||      ||
||   ||    ||  the graphic designer practice deconstructed to its rawer shape    ||  ||
        
"web harvesting"

Welcome !

Text editing in the terminal challenges the traditional way of treating text on a white A4 canvas. The terminal can be used to scrape content from the web as much as to automate the layout of text into patterns. The methods presented in this zine stand as an introduction to alternative editing techniques. Because, disclaimer, the very best way to lay out text without DTP software (Adobe, etc.) is to use a mark-up language such as HTML, styled with CSS; both are languages built for the web. They require deeper learning, which can't be achieved from these zines alone.

This zine #4 brings you a fun introduction to web scraping, followed by two ways of automating the editing of text. At the end, a bonus introduces you to Vim, a command-line-based text editor !!! expert level ;)

notes

in practice

#1 web scraping

Web scraping or web harvesting is the automation of the structured data extraction process from websites.

An interesting way to get introduced to web scraping is to use a Python library that extracts content from Wikipedia pages. The first step is to install the library.

-> to better understand the system of packages in the terminal, you can refer to zine #3, page 4.

-> In 1989, Tim Berners-Lee, a British scientist, conceived the World Wide Web. Static information was displayed in HTML, which made it easy for developers to write scripts that could extract data programmatically. But the evolution of the web towards dynamic websites challenged traditional scraping techniques. Expensive dedicated software has since been developed, and scraping is now pushed further to build machine-learning training datasets.
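
-> To see why static HTML is easy to extract data from, here is a minimal sketch using only Python's built-in html.parser module. The page content and the class name HeadingCollector are made up for the illustration; a real scraper would first download the page.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text found inside <h1> and <h2> tags."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        # A heading opens: start a new empty entry.
        if tag in ("h1", "h2"):
            self.in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2"):
            self.in_heading = False

    def handle_data(self, data):
        # Only keep text that sits inside a heading.
        if self.in_heading:
            self.headings[-1] += data


# A made-up static page, standing in for a downloaded one.
page = """
<html><body>
<h1>Piracy</h1>
<p>Some paragraph.</p>
<h2>History</h2>
</body></html>
"""

collector = HeadingCollector()
collector.feed(page)
print(collector.headings)  # ['Piracy', 'History']
```

The structure of the tags is what makes the extraction "structured" : the script never reads the paragraph, it only follows the headings.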

Web scraping is not illegal in itself, but it is not clearly legal either. No single regulation covers it; the results of court cases show that the legality of web scraping is a case-by-case matter. Web scraping often involves screen scraping, which is the collection of already rendered information from the front end. This activity does not interfere with the technical side of a website. Plus, data scraped this way is often unprotected, and anyone can collect it.
So for an individual practice, space still exists for a bit of scraping.

for macOS

Homebrew, the macOS package manager :

To install it, simply open a terminal window and run :

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

/!\ if an error occurs after that, read what the terminal tells you to do -> there is often a line of text to copy and run, or the computer password to type in


Python

brew install python

To check the version installed, run :

python3 --version


wikipedia

pip3 install wikipedia

for Windows

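Python may already be installed. Verify with :

python --version   (or python3 --version)

If the command is not found, install Python from python.org first.

Then check if pip is installed :

pip --version

If pip is not installed, download get-pip.py from https://bootstrap.pypa.io/get-pip.py, open a terminal in the folder where it was saved and type :

python get-pip.py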


wikipedia

pip install wikipedia
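
-> To double-check from Python that the library is really there, a small sketch like this one can help. The helper name is_installed is made up; it relies on importlib, which ships with Python.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can find a package with this name."""
    return importlib.util.find_spec(package_name) is not None


# "wikipedia" is the library installed above; the answer depends on your machine.
print("wikipedia installed :", is_installed("wikipedia"))
```

If the answer is False, run the pip command again and read the messages in the terminal.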

in practice

extract a summary

/!\ an internet connection is required

1. Extract the summary of a wikipedia web page

- Open a terminal window

- Start a Python session by typing (depending on the version installed)

python OR python3

- Call the wikipedia package

import wikipedia

- Press enter, and ask for the summary

print(wikipedia.summary("piracy"))

Here it is ! A full summary of the webpage is displayed !

"Piracy is an act of robbery or criminal violence by ship or boat-borne attackers upon another ship or a coastal area, typically with the goal of stealing cargo and other valuable goods. Those who conduct acts of piracy are called pirates, and vessels used for piracy are called pirate ships. The earliest documented instances of piracy were in the 14th century BC, when the Sea Peoples, a group of ocean raiders, attacked the ships of the Aegean and Mediterranean civilisations. [...]"

-> It mainly gives the first sentences of the webpage.



2. Select the number of sentences to make it shorter :

print(wikipedia.summary("piracy", sentences=2))

"Piracy is an act of robbery or criminal violence by ship or boat-borne attackers upon another ship or a coastal area, typically with the goal of stealing cargo and valuable goods, or taking hostages. Those who conduct acts of piracy are called pirates, and vessels used for piracy are called pirate ships."

first illustration of the "Piracy" wikipedia page : "The traditional "Jolly Roger" flag of piracy"
in practice

browse pages from a keyword

Keywords, like "piracy", can refer to many pages. When asked for a summary, the library displays the most obvious one. But we can first search for the different pages in which our keyword is mentioned, and then focus on the one we are interested in.

1. List of results by keywords :

result = wikipedia.search("piracy")
print(result)

['Piracy', 'Copyright infringement', 'Piracy off the coast of Somalia', 'Online piracy', 'Golden Age of Piracy', 'Piracy in the Caribbean', 'Piracy in the 21st century', 'Anti-piracy', 'Pirate (disambiguation)', '1680s in piracy']

The results are displayed in a list []. Each result is indexed by a number, starting from 0.
result[0] : "Piracy"
result[1] : "Copyright infringement"
result[7] : "Anti-piracy"
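
-> The indexing can be tried without an internet connection. In this sketch the list is copied by hand from the search result above, so no library is needed.

```python
# The list returned by the search above, copied by hand.
result = ['Piracy', 'Copyright infringement', 'Piracy off the coast of Somalia',
          'Online piracy', 'Golden Age of Piracy', 'Piracy in the Caribbean',
          'Piracy in the 21st century', 'Anti-piracy', 'Pirate (disambiguation)',
          '1680s in piracy']

# Print every title next to its index, starting from 0.
for index, title in enumerate(result):
    print(index, title)

print(len(result))                    # 10 results in total
print(result[-1])                     # a negative index counts from the end
print(result.index("Online piracy"))  # find the index of a known title : 3
```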



2. Summary of a result : let's use the commands seen previously to display the summary of 'Online piracy' :

print(wikipedia.summary(result[3], sentences=4))

"Online piracy or software piracy is the practice of downloading and distributing copyrighted works digitally without permission, such as music, movies or software.
== History ==
Nathan Fisk traces the origins of modern online piracy back to similar problems posed by the advent of the printing press. Quoting from legal standards in MGM Studios, Inc. v."

->test it on different themes and webpages !

-> to leave the Python session, type exit() and press Enter. You can also press Ctrl+Z then Enter on Windows, or Ctrl+D on Mac.
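
-> Instead of copying the text by hand, the summary can be written straight into a .txt file. This is a minimal sketch : the function name save_summary and its fetch parameter are made up here. The fetch parameter only exists so the function can be tried offline with a stand-in; by default it uses wikipedia.summary, as above.

```python
from pathlib import Path


def save_summary(keyword, path, sentences=4, fetch=None):
    """Fetch a summary and write it to a text file; return the text."""
    if fetch is None:
        # Third-party library installed earlier; needs an internet connection.
        import wikipedia
        fetch = wikipedia.summary
    text = fetch(keyword, sentences=sentences)
    Path(path).write_text(text, encoding="utf-8")
    return text
```

With an internet connection, save_summary("Online piracy", "webscraping.txt") should create the file in the current folder.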

into practice - beginner

#2 text-editing

Once this text is copied, let's try to lay it out as a poster ! For this, open a new terminal window and type :

1. Install dependencies

pip install asciiwriter

pip install argparse


2. Download the script

- Get on becomingnoob.com

- Open the door, and click on "workshop", download the document called "text_lines.py"

- Place it on your desktop, then type

cd Desktop


3. Run the command

- To achieve the following effect, run this command

python text_lines.py "Becoming A Noob"

The result should be displayed in the terminal !


4. Adapt the variables

- Download a text editor : Visual Studio Code or Sublime text

- Open "text_lines.py" in the text editor

- Play with the width, height and amplitudes of the outcome on lines 13 and 20 !

                                            B
                                    ee ee e ee ee e
                            c  c  c  c  c   c  c  c  c  c
                      o   o    o   o    o    o   o    o   o    o
              m     m     m     m     m     m    m     m     m     m
        i      i      i      i      i       i      i      i      i      i
            n       n       n       n        n       n       n       n       n
        g        g        g        g        g       g        g        g        g

              a         a         a         a        a         a         a

              n         n         n         n        n         n         n
                o        o         o         o        o         o        o
                  o        o        o        o       o        o        o
            a       b       b       b        b       b       b       b       a
        n             B      B      B       B      B      B      B
              g     n     e     e     e     e    e     e     e     n
                          o    c   c    c    c   c    c   c    o
                            a  o  o  o  o   o  o  o  o  o
                                      b mm m mm mm b
                                            iB
                                      e nn n nn nn eo
                                c  g  g  g   g  g  g  g  c  o
                          o                                    o   b
                    m     a     a     a     a    a     a     a     m     B
                i                                                        i      e
            n       n       n       n        n       n       n       n       n
        g        o        o        o        o       o        o        o        g
                o        o         o         o        o         o        o
              b         b         b         b        b         b         b
              B         B         B         B         B         B         B
              e         e         e         e        e         e         e
                c        c         c         c        c         c        c
                  o        o        o        o       o        o        o
            a       m       m       m        m       m       m       m       a
                      i      i      i       i      i      i      i             c
                    n     n     n     n     n    n     n     n     n     o
                          o    g   g    g    g   g    g   g    o   m
                                o                        o  i
                                      b aa a aa aa bn
                                            BB
                                      e nn n nn nn e
                            a  c  o  o  o   o  o  o  o  c
                          o    o   o    o    o   o    o   o    o
              n     m     b     b     b     b    b     b     b     m
        o      i      B      B      B       B      B      B      B      i
            n       e       e       e        e       e       e       e       n
        g        c        c        c        c       c        c        c        g
                o        o         o         o        o         o        o
              m         m         m         m        m         m         m
              i         i         i         i         i         i         i
              n         n         n         n        n         n         n
                g        g         g         g        g         g        g

            a       a       a       a        a       a       a       a       a
        o
              b     n     n     n     n     n    n     n     n     n
                      B   o    o   o    o    o   o    o   o    o
                            e  o  o  o  o   o  o  o  o  o
                                    cb bb b bb bb b
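
-> Curious about what width, height and amplitude can do in such a script ? The sketch below is NOT text_lines.py, only a guess at the general idea : each row repeats one letter of the text, and the gap between letters follows a sine wave. The function name wave_lines and its default values are made up.

```python
import math


def wave_lines(text, width=60, height=12, amplitude=4):
    """Return rows of repeated letters whose spacing follows a sine wave."""
    letters = [c for c in text if not c.isspace()]
    rows = []
    for y in range(height):
        letter = letters[y % len(letters)]
        # The gap between letters swells and shrinks along the wave.
        gap = 1 + round(amplitude * (1 + math.sin(y * math.pi / 6)) / 2)
        # Fewer letters fit on a row when the gap is wide.
        count = max(1, width // (gap + 1) // 2)
        row = (" " * gap).join(letter * count)
        rows.append(row.center(width).rstrip())
    return rows


for row in wave_lines("Becoming A Noob"):
    print(row)
```

Changing amplitude changes how far the letters drift apart, and height changes how many rows (and so how many letters of the text) are drawn.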
    
into practice - advanced

#3 text-editing

Another technique to edit the text extracted previously is to integrate it into an ASCII image. For this, open a new terminal window and type :

1. Install dependencies

pip install pillow

pip install reportlab

pip install argparse


2. Download the script

- Get on becomingnoob.com

- Open the door, and click on "workshop", download the document called "ascii_art.py"

- Place it in a new folder on your desktop called "Ascii-test", then type

cd Desktop\Ascii-test

(on macOS : cd Desktop/Ascii-test)

- Add the image you want to work with in the folder

- Add a .txt file with the text you want to work with in the folder


3. Run the command

- To achieve the following effect, run this command (if your downloaded file has a different name, such as "ascii_art.py", use that name instead)

python ascii_art_printer.py 3.jpg .\webscraping.txt --width 50

The result is displayed in the terminal. You can also print it directly on the default printer :

python ascii_art_printer.py 3.jpg .\webscraping.txt --width 50 --print


4. Adapt the variables in the command

- Play with the width and font size of your output at the end of your command :

--width 150 --font_size 5

150 as the number of characters and 5 as the font size in pt
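
-> How does a script read options like --width ? Here is a sketch with argparse, which ships with Python. It is an assumption about how such a script could declare its options, not the code of the workshop script : the default values and help texts are made up.

```python
import argparse


def build_parser():
    """Declare the same kind of arguments as the command above."""
    parser = argparse.ArgumentParser(description="Render a text in the shape of an image.")
    parser.add_argument("image", help="path to the image, e.g. 3.jpg")
    parser.add_argument("text", help="path to the .txt file")
    parser.add_argument("--width", type=int, default=50,
                        help="number of characters per line")
    parser.add_argument("--font_size", type=int, default=5,
                        help="font size in pt (default value is a guess)")
    parser.add_argument("--print", dest="send_to_printer", action="store_true",
                        help="send the result to the default printer")
    return parser


# Simulate the command line instead of reading the real one.
args = build_parser().parse_args(
    ["3.jpg", "webscraping.txt", "--width", "150", "--font_size", "5"]
)
print(args.width, args.font_size, args.send_to_printer)  # 150 5 False
```

Options starting with -- are optional and can come in any order; the image and the text file are positional, so their order matters.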


5. Adapt the variables in a text editor

- Download a text editor : Visual Studio Code or Sublime text

- Open "ascii-art.py" in the text editor

- Play with the brightness on line 23 !

if brightness > 0.85:

 The symptoms of accelerated crisis are widely recognized. Multiple attempts have been made to explai
 n them. I believe that this crisis is rooted in a major twofold experiment which has failed, and I c
 laim that the resolution of the crisis begins with a recognition of the failure. For a hundred years
 we have tried to make mac ines work for men and to school men for life in their service. Now it tur
 ns out that machines do not “work” and that people cannot be schooled for a life at the service of m
 achines. The hypothesis on which the experiment was built must now be discarded. The hypothesis was 
that machines can replace slaves. Ivan Illich - Tools for Conviviality http://clevercycles.com/tools
 _for_conviviali y/ 17 of 126 10/11/07 9:20 AM The evidence shows that, used for this purpose, machin
 es enslave men. Neither a di  atorial proletariat nor a leisure mass can escape the dominion of cons
 tantly expanding industrial t ols. The crisis can b                             invert the present d
 eep structure of tools; if we give people tools th                                 with high, indepe
 ndent efficiency, thus simultaneously eliminating                                 sters and enhancin
 g each person’s range of freedom. People need new   ol     wo    i     t     ha  tools that “work” f
 or them. They need technology to make the most of   e     gy     im   nat    ea   has, rather than m
 ore well-programmed energy slaves. I believe that   ci    mu    e     nst    ed  o enlarge the contr
 ibution of autonomous individuals and primary grou   t    e     l     ct    ess  f a new system of p
 roduction designed to satisfy the human needs whic  it    o     rm    .     act  the institutions of
 industrial society do just the opposite. As the p  er    ma    es    re    ,     role of persons mo
 re and more decreases to that of mere consumers. I  iv    ls    d     s     ov  and to dwell. They n
 eed remedies for their diseases and means to commu  ca    it    e     he    eo  e cannot make all th
 ese things for themselves. They depend on being s  pli   wit   bje    an    rv  es which vary from c
 ulture to culture. Some people depend on the supp   of   od  n  ot   s o    e   pply of ball bearing
 s. People need not only to obta n things, they ne   ab    al   he    edo   o m  e things among which
 they can live, to give shape to them according t  th    ow    st    an     pu  them to use in carin
 g for and about others. Prisoners in rich count     o     h    ac     t    re   ings and services th
 an members  f their families, but they have no     in     t    s     to     a   and c           e wh
 at to do with them. Their punishment consists i    in    pr     o    at    ha   call             y.”
 They are degraded to the status of mere consum     I    os    e      “    iv  lity”              th
 e opposite of Ivan Illich - To ls for Convivialit  ht    /c   erc   es.    to  s_for_              1
 8 of 126 10/11/07 9:20 AM industrial productivity                    n     no  us and    at      ter
 course among persons, and the interc ur   of per                               d    s i  contrast wi
 th the conditioned response of pers ns    th   emands m                         an   y            en
 vironment. I consider conviviali      b  in   idual freedom realized in per   al in   dependence and
 , as such, an intrinsic ethica  v  ue. I be  eve that, in any society, as      vial          uced be
 low a certain level, no amount         tria  productivity can eff              fy th        it creat
 es among society’s members. Pre             ional purposes, which               ial p   uct         
the expense of convivial effec       s  ar  a major facto                       and me  inglessness 
that plague contemporary soci   . T   incr  sing demand for products has come    defin  soci      pr
 ocess. I will suggest how t     re ent tre   can be reversed                     ce an   echnology c
 an be used to endow human acti          un  ecedented effective                  l wou   permit the 
evolution of a life style and of a p  iti    system which give priority to the    tect   , th  maxim
 um use, and the enj     t of the one reso   e that is al  s                       mon    l people: p
 ersonal energy und                rol. I    l argue that we can no                 wo   effectively 
without public co                s   d in    utions that curtail or negate any     on   right to the
 creative use of hi             gy. For t    purpose we need procedu                    ontrols over
 the tools of socie y ar            d an    verned by political proc                   decisions by 
ex  rts. The transition     ocial    can    be effected without an inversion of o      sent institut
     and the s  stitution of convivial f    ndustrial tools. At the                   ool  g of soci
    will rem    a pious dream unless the   eals of socialist jus                      e that the pre
   t crisis o      major institutions ou  t to be welcomed as a crisis of revoluti    y liberation b
   use our  r              ions abridge   sic human Ivan Illi                         ty http://clev
  cycles.com/  o                it / 19  f 126 10/11/07 9:20                          f pr  iding pe
 ple wit       instituti            .   is  orld-wide crisis                            ca  lead to 
a new consci usne    bout the nat  e    tools and to majority action for their contr    If tools are
 not cont  lled polit        they will be  anaged in a belat      hno    ic res         disaster. Fr
 eedom and dign      ll c     u  to dissolve into an unpreced                            hi  tools. A
 s an alternative    t   n      c disaster, I propose the vis                            A convivial 
so iety would be the resu   of social arrangements that guarantee for eac           e    t ample and
 f ee    ess to the tools of the community and limit this fr  dom only in favor of an    r member’s 
  ual           t present people tend to relinquish the tas                                professio
 na    i       y t     er power to politicians who promise t                               iver this 
f       Th y acc            g range  f power levels in soc                                  maintain
 h gh ou  uts. P             itutio    hemselves become draft mechanisms to p              to compli
 city with  utput goa     hat    ri     omes to be subordin   d to what is good for inst    ions. Jus
 tice is debased to mean  he equa      ribution of institu                                   n    is 
intolerably red ced  y a societ  th   defines the maximum                                        he 
largest consumption of industrial goods. Alternate politi                                         f 
 ermitting all people to define the images of their own f                                          t
   xclu e the design of artifacts and rules that are obstacles to the exercis                    ree
 om.       olitics would limit the scope of tools as demanded by the protection of three values: sur
 v  al  justice, and self-defined work. I take these values to be fundamental to any convivial societ
 y, ho  ver different one such society might be from another in practice, institutions, or rationale.
 Each of these three values imposes its own limits on tools. The Ivan Illich - Tools for Conviviali
    
extract of Tools for Conviviality by Ivan Illich, in a beach chair
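
-> What does the brightness test do ? The sketch below is a guess at the principle, not the workshop script : the text flows over a grid, and wherever the image is brighter than the threshold, the letter is replaced by a blank. The function name mask_text and the tiny grid of brightness values (from 0 for black to 1 for white) are made up; with Pillow, such values could come from a resized greyscale version of the image.

```python
def mask_text(text, brightness, threshold=0.85):
    """Lay text over a grid; bright cells become blanks."""
    lines = []
    index = 0
    for row in brightness:
        line = ""
        for value in row:
            # The letter is consumed even when it is hidden,
            # so the text keeps flowing "behind" the bright shape.
            line += " " if value > threshold else text[index % len(text)]
            index += 1
        lines.append(line)
    return lines


# A made-up 2 x 3 "image" : two bright cells, four dark ones.
grid = [[0.1, 0.9, 0.1],
        [0.9, 0.1, 0.1]]

for line in mask_text("abcdef", grid):
    print(line)
```

Lowering the threshold makes more of the image count as "bright", so the blank shape grows.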
bonus

#4 text-editing with VIM

=======================================================================================
  =    W e l c o m e   t o   t h e   V I M   T u t o r    -    Version 1.7      =
=======================================================================================
        

Vim is a highly configurable text editor built to enable efficient text editing. It is very useful for writing and modifying code or text in any text format (.html, .txt, .py, .css, etc).

Its text-based interface resembles the terminal's, and Vim can be opened directly from it as well, which makes it easier to navigate from one to the other. It is based on the same principle : the user can activate a multitude of commands, but they have to be known. There are no buttons for anything, everything happens from the keyboard.

Install VIM

-> On Windows, Vim must be installed and opened as an application. Check the Vim website (vim.org/download.php) and follow the instructions.

-> On Mac, just run : brew install vim


in practice

Because of the text-based interface, it is very easy to feel lost in Vim the first time you use it. In fact, you already need to know a few commands to do even simple text editing. That's why "vimtutor" was developed. It can be opened directly from the terminal by typing : vimtutor

"vi" refers to vim

A window opens and takes the user through various exercises to learn the essential commands and keyboard actions.

Modes : Vim has 3 editing modes. They are accessible via keyboard shortcuts.

  • Command (default, called "Normal" in Vim's own help) : to write commands
    type esc
  • Insert : to add text
    type i or I, o, A
  • Visual : to select
    type v or V


Navigation : To move around or select text, the fastest way in Vim is to use the keyboard (again) and not the mouse. The keys are quite easy to remember, but they have to become part of the user's muscle memory to be really efficient, which is doable with some practice.

 
        /\         press
         k         k to go up
   < h       l >   h to go left
                   l to go right
         j         j to go down
         v
      

Buffers : a buffer is a temporary space in memory that holds the text of an opened file. Buffers make it possible to open and edit several files in parallel, check a document, etc. Switching from one buffer to another is quite easy and considerably improves the speed and efficiency of editing.

The most useful one at first can be called by typing :help. A buffer opens with access to the Vim help resources, where you can scroll through the explanations of all possible commands.

- Vim first language : in command mode "esc"

command             description                                           options
:q                  quit                                                  :wq : save and quit
:w                  save or save as                                       :w! : overwrite
                                                                          :w file.name : save as...
:                   introduce a command
u, U                undo the last change, undo for the whole line         ctrl+R : redo
:help "command"     open the help resources
:bd                 close the buffer
ctrl+Z              leave Vim temporarily, back to the terminal
fg                  foreground : comes back to Vim
ctrl+W w            switch to the other window
:split "file.name"  open a window in Vim                                  :close! : close the window
x                   delete what's under the cursor
dd                  delete the whole line
y                   yank : copy what has been selected (in Visual mode)   p : put -> paste
A                   append : goes to Insert mode at the end of the line   I : idem but at the start of the line
gg                  go to the top of the document                         G : go to the bottom
r"x"                change the character under the cursor to a new one
:w | !open %        open a preview in the default browser (macOS)
:set nu             show line numbers
:set textwidth=x    set the text width to x characters                    gq : apply to the selection
:x,y>>              indent lines x to y twice (>>)
/                   introduce a search                                    :set hlsearch : highlight the search in the text

#5 archiving the work

To mark everyone's entrance into the noob community, every participant is invited to archive their achievements digitally and on paper.

  • select 2 outcomes (.png or .jpg) that were made during the workshop : modified or edited images, screenshots from the terminal, etc.
  • name them like this : date + title (or name of technique used) + your name
  • DD-MM-YY_title_your-name.jpg

  • A USB stick is dedicated to collecting them
  • Connect to the printer and print one of them twice : one for you, one for the Noob's archive
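
-> The file name can also be built by Python. A small sketch, where the function name archive_name, the example date, title and name are all made up; putting everything in lowercase with hyphens is a choice of this sketch, not a rule of the archive.

```python
from datetime import date


def archive_name(day, title, author, extension=".jpg"):
    """Build a name like DD-MM-YY_title_your-name.jpg"""
    def slug(words):
        # Lowercase, and hyphens instead of spaces.
        return "-".join(words.lower().split())

    return f"{day.strftime('%d-%m-%y')}_{slug(title)}_{slug(author)}{extension}"


print(archive_name(date(2025, 3, 7), "ascii poster", "Ada Noob"))
# 07-03-25_ascii-poster_ada-noob.jpg
```

Use date.today() instead of a fixed date to stamp the file with the day of the workshop.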

sources

  • Scraping Wikipedia Pages with Python Wikipedia Library, on the Python Tutorial blog, well.sr website, 2020
  • What is web scraping and how to use it, on GeeksforGeeks website, 2025
  • Beautiful Soup documentation, on the Bs4 website
  • Implementing Web Scraping in Python with BeautifulSoup, on GeeksforGeeks website, 2024
  • The Legalities of Web Scraping: What’s Allowed and What’s Not, on Automatio website, 2021
  • What Is a Web Crawler, and How Does It Work?, by Vann Vincente, on How-To Geek website, 2021
  • Webcrawlers, on Wikipedia
  • Working With Buffers in Vim: A Guide, by Igor Irianto, on Built In website, 2024