surplus/DEVELOPING.md
Mark Joshwel 4a2ea88fca docs: split up docs
these will be improved on a later date
2024-03-26 18:48:13 +00:00

the developer's guide to surplus

quickstart

prerequisites:

alternatively, use devbox for a hermetic development environment powered by Nix.

devbox shell    # skip this if you aren't using devbox
hatch shell

common commands

  • hatch fmt
    formats and statically analyses the codebase

  • hatch run dev:check
    runs mypy and isort to check the codebase

  • hatch run dev:fix
    runs isort to fix imports

the technical details of surplus's output

Note

this is a breakdown of surplus's output when converting to shareable text. when converting to other output types, output may be different.

$ s+ --debug 8QJF+RP Singapore
surplus version 2.2.0, debug mode (latest@future, Tue 05 Sep 2023 23:38:59 +0800)
debug: parse_query: behaviour.query=['8QJF+RP', 'Singapore']
debug: _match_plus_code: portion_plus_code='8QJF+RP', portion_locality='Singapore'
debug: cli: query=Result(value=LocalCodeQuery(code='8QJF+RP', locality='Singapore'), error=None)
debug: latlong_result.get()=Latlong(latitude=1.3320625, longitude=103.7743125)
debug: location={...}
debug: _generate_text: split_iso3166_2=['SG', '03']
debug: _generate_text: using special key arrangements for 'SG-03' (Singapore)
debug: _generate_text: seen_names=['Ngee Ann Polytechnic', 'Clementi Road']
debug: _generate_text_line: [True]               -> True   --------  'Ngee Ann Polytechnic'
debug: _generate_text_line: [True]               -> True   --------  '535'
debug: _generate_text_line: [True]               -> True   --------  'Clementi Road'
debug: _generate_text_line: [True, True]         -> True   --------  'Bukit Timah'
debug: _generate_text_line: [False, True]        -> False  filtered  'Singapore'
debug: _generate_text_line: [True]               -> True   --------  '599489'
debug: _generate_text_line: [True]               -> True   --------  'Northwest'
debug: _generate_text_line: [True]               -> True   --------  'Singapore'
0       Ngee Ann Polytechnic
1
2
3       535 Clementi Road
4       Bukit Timah
5       599489
6       Northwest, Singapore
Ngee Ann Polytechnic
535 Clementi Road
Bukit Timah
599489
Northwest, Singapore

variables

  • variables behaviour.query, split_query and original_query

    (split_query and original_query are only shown if query is a latlong coordinate or query string)

    behaviour.query is the value passed to parse_query() for parsing: either the original query string, or a list of strings produced by space-splitting it (as the command line does)

    split_query is the original query string split by spaces

    original_query is a single non-split string

    $ s+ Temasek Polytechnic
         -------------------
         query
    
    behaviour.query -> ['Temasek', 'Polytechnic']
    split_query     -> ['Temasek', 'Polytechnic']
    original_query  -> 'Temasek Polytechnic'
    
    >>> surplus("77Q4+7X Austin, Texas, USA", surplus.Behaviour())
    
    behaviour.query -> '77Q4+7X Austin, Texas, USA'
    split_query     -> ['77Q4+7X', 'Austin,', 'Texas,', 'USA']
    original_query  -> '77Q4+7X Austin, Texas, USA'
    
  • variables portion_plus_code and portion_locality

    (only shown if the query is a local code; not shown for full-length Plus Codes, latlong coordinates or string queries)

    these are, respectively, the Plus Code portion and the locality portion of a shortened Plus Code (referred to as a "local code" in the codebase)

  • variable query

    query is a variable of type Result[Query]

    this variable is displayed to show which query type parse_query() has recognised, and whether any errors occurred during query parsing

  • expression latlong_result.get()=

    (only shown if the query is a Plus Code)

    the latitude and longitude coordinates derived from the Plus Code

  • variable location

    the response dictionary from the reverser function passed to surplus()

    for more information on the reverser function, see SurplusReverserProtocol

  • variable split_iso3166_2 and special key arrangements

    a list of strings containing the ISO 3166-2 code (country/subdivision identifier) split into its parts, e.g. 'SG-03' becomes ['SG', '03']

    if special key arrangements are available for the code, a line similar to the following will be shown:

    debug: _generate_text: using special key arrangements for 'SG-03' (Singapore)
    
  • variable seen_names

    a list of unique important names found in certain Nominatim keys used in final output lines 0-3

  • _generate_text_line seen name checks

    #                           filter function boolean list   status    element
    #                           =============================  ========  ======================
    debug: _generate_text_line: [True]               -> True   --------  'Ngee Ann Polytechnic'
    debug: _generate_text_line: [False, True]        -> False  filtered  'Singapore'
    

    elements taken from the shareable text line 4 keys (SHAREABLE_TEXT_LINE_4_KEYS, general regional location) are checked against seen_names to cut down on repeated elements

    the reasoning is that if an element on line 4 (general regional location) is the same as a previously seen name, there is no need to include it again

    • filter function boolean list

      _generate_text_line, an internal function defined inside _generate_text, can be passed a filter function to filter out certain elements on a line

      # the filter used in _generate_text, for line 4's seen name checks
      filter=lambda ak: [
          # everything here should be True if the element is to be kept
          ak not in general_global_info,
          not any(True if (ak in sn) else False for sn in seen_names),
      ]
      

      general_global_info is a list of strings containing elements from line 6 (general global information)

    • status

      what all(filter(detail)) evaluates to, filter being the filter function passed to _generate_text_line and detail being the current element

    • element

      the current element while iterating through a list of strings containing elements from line 4 (general regional location)
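
the seen name check described above can be sketched in a few lines of standalone Python. this is a minimal illustration, not surplus's actual code: keep_element and line_4_filter are hypothetical names, and the values of seen_names and general_global_info are assumed from the sample debug output earlier in this document.

```python
from typing import Callable

# assumed values, read off the sample debug output (not fetched from Nominatim)
seen_names = ["Ngee Ann Polytechnic", "Clementi Road"]
general_global_info = ["Northwest", "Singapore"]


def line_4_filter(ak: str) -> list[bool]:
    """mirror of the documented filter: every entry must be True to keep ak"""
    return [
        ak not in general_global_info,
        # 'ak in sn' on two strings is a substring test
        not any(ak in sn for sn in seen_names),
    ]


def keep_element(
    element: str, checks: Callable[[str], list[bool]]
) -> tuple[list[bool], bool]:
    """hypothetical helper: return the boolean list and its all() status"""
    results = checks(element)
    return results, all(results)


for element in ["Bukit Timah", "Singapore"]:
    results, status = keep_element(element, line_4_filter)
    label = "--------" if status else "filtered"
    print(f"{results} -> {status}  {label}  {element!r}")
```

with these assumed inputs, 'Bukit Timah' passes both checks and is kept, while 'Singapore' fails the first check because it already appears in general_global_info, which matches the two debug lines shown above.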

line breakdown of shareable text output, with the Nominatim keys used for each line:

0       name of a place
1       building name
2       highway name
3       block/house/building number, house name, road
4       general regional location
5       postal code
6       general global information
  0. name of a place

    (usually important places or landmarks)

    • examples

      The University of Queensland
      Ngee Ann Polytechnic
      Botanic Gardens

    • nominatim keys

      emergency, historic, military, natural, landuse, place, railway, man_made,
      aerialway, boundary, amenity, aeroway, club, craft, leisure, office, mountain_pass,
      shop, tourism, bridge, tunnel, waterway

  1. building name

    • examples

      Novena Square Office Tower A
      Visitor Centre

    • nominatim keys

      building

  2. highway name

    • examples

      Marina Coastal Expressway
      Lornie Highway

    • nominatim keys

      highway

  3. block/house/building number, house name, road

    • examples

      535 Clementi Road
      Macquarie Street
      Braddell Road

    • nominatim keys

      house_number, house_name, road

  4. general regional location

    • examples

      St Lucia, Greater Brisbane
      The Drag, Austin
      Toa Payoh Crest

    • nominatim keys

      residential, neighbourhood, allotments, quarter, city_district, district, borough,
      suburb, subdivision, municipality, city, town, village

  5. postal code

    • examples

      310131
      78705
      4066

    • nominatim key

      postcode

  6. general global information

    • examples

      Travis County, Texas, United States
      Southeast, Singapore
      Queensland, Australia

    • nominatim keys

      region, county, state, state_district, country, continent
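
to make the key-to-line mapping concrete, here is a rough sketch of how lines 0-6 could be assembled from a flat address dictionary. it is an illustration under several assumptions rather than surplus's implementation: LINE_KEYS, values_for and build_lines are hypothetical names, the address dictionary and its key assignments are made up to resemble the sample query, joining line 3 with spaces and other lines with commas is inferred from the sample output, and both the seen name check and the special key arrangements (such as those for 'SG-03') are left out.

```python
# key lists copied from the line breakdown above; the dict itself is hypothetical
LINE_KEYS: dict[int, tuple[str, ...]] = {
    0: (
        "emergency", "historic", "military", "natural", "landuse", "place",
        "railway", "man_made", "aerialway", "boundary", "amenity", "aeroway",
        "club", "craft", "leisure", "office", "mountain_pass", "shop",
        "tourism", "bridge", "tunnel", "waterway",
    ),
    1: ("building",),
    2: ("highway",),
    3: ("house_number", "house_name", "road"),
    4: (
        "residential", "neighbourhood", "allotments", "quarter",
        "city_district", "district", "borough", "suburb", "subdivision",
        "municipality", "city", "town", "village",
    ),
    5: ("postcode",),
    6: ("region", "county", "state", "state_district", "country", "continent"),
}


def values_for(address: dict[str, str], keys: tuple[str, ...]) -> list[str]:
    """collect the values of whichever keys are present, in key order"""
    return [address[key] for key in keys if key in address]


def build_lines(address: dict[str, str]) -> list[str]:
    """return seven strings, one per shareable text line (empty if unused)"""
    general_global_info = values_for(address, LINE_KEYS[6])
    lines: list[str] = []
    for number, keys in LINE_KEYS.items():
        values = values_for(address, keys)
        if number == 4:
            # only the first documented check; the seen name check is omitted
            values = [v for v in values if v not in general_global_info]
        # assumption: line 3 reads like a street address, other lines are lists
        separator = " " if number == 3 else ", "
        lines.append(separator.join(values))
    return lines


# made-up address resembling the sample query; key assignments are guesses
address = {
    "amenity": "Ngee Ann Polytechnic",
    "house_number": "535",
    "road": "Clementi Road",
    "suburb": "Bukit Timah",
    "city": "Singapore",
    "postcode": "599489",
    "county": "Northwest",
    "country": "Singapore",
}

lines = build_lines(address)
for number, line in enumerate(lines):
    print(f"{number}\t{line}")

# the final shareable text drops empty lines, as in the sample output
print("\n".join(line for line in lines if line))
```

keeping the per-line key lists in one mapping makes it easy to see which Nominatim key feeds which line; the real codebase refers to constants such as SHAREABLE_TEXT_LINE_4_KEYS for this purpose.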