System Design Fundamentals

Intro to Interviews

  1. Asking Questions
    1. Which features are involved, what is the stack, what is bloat, and where do the troubles (if any) lie?
    2. What sort of scaling has to be accounted for?

Features

  • The feature set

    Note the specs carefully. Feel free to annotate them a bit.

  • Define APIs and Endpoints.

    Knowing which routes the public will hit and what sort of auth is in use is essential.

  • Availability

    What to do if a host goes down, and what to do if the entire data centre goes down. If the system already exists, enquire about the current plans and ascertain how much availability is actually required.

  • Latency and Performance

    Public-facing services require snappy responses. Keep track of this with monitoring tools.

  • Scalability
  • Durability

    Data should be stored in a DB securely, without loss or compromise. Ask what sort of DBs I am working with.

  • Class Diagrams

    OOP diagrams, basically. They may ask you to design a parking lot or an elevator system.

  • Security and Privacy

    TLDR: When users and auth are involved, these practices become sacrosanct.

  • Cost Effective

    Lean systems are not only cost effective but also easier to maintain. KISS. Check the pros and cons of the current and alternative flows.

Concepts

Vertical vs Horizontal Scaling

  • Vertical scaling is adding more juice (CPU, RAM, disk) to a single host to handle the extra load
    • You can't go beyond a certain point
    • Gets expensive
    • All eggs in one basket: the host is a single point of failure
  • Horizontal scaling is adding more hosts and distributing the load across them
    • This is technically more challenging, since the hosts need to stay in sync and requests need good routing
    • The usual distributed systems problems apply

CAP Theorem (Brewer)

CAP stands for:

  1. Consistency
  2. Availability
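  3. Partition Tolerance (needed in practice, because the network can always drop or delay packets, so the real trade-off during a partition is between Consistency and Availability)
  • Traditional DBs choose Consistency over Availability
  • NoSQL stores often choose the opposite: Availability over Consistency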

ACID vs BASE

ref:

  1. ACID - (RDBMS) Atomic, Consistent, Isolated, and Durable
  2. BASE - (NoSQL) Basically Available, Soft state, Eventual consistency
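
To make the "Atomic" part of ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names in it, and the transfer helper are all made up for illustration. The credit is deliberately done before the debit, so that a failed debit shows the earlier credit being rolled back too.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if it was rolled back."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # The CHECK constraint rejects an overdraft here, undoing the credit above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical schema and data, in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

Calling transfer(conn, "alice", "bob", 500) returns False and leaves both balances untouched, which is the all-or-nothing behaviour the A in ACID refers to.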

Partitioning or Sharding
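
A minimal sketch of the simplest scheme, hash-based sharding: hash the key and take it modulo the number of shards. The shard_for name and the key format are assumptions for illustration, not something from a particular DB.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards).

    Uses SHA-256 rather than the built-in hash(), which is randomized per
    process for strings and so would not be stable across hosts.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

The catch with plain modulo hashing is that changing num_shards remaps most keys; consistent hashing is the usual way to limit that.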

refs:

Locking (DBs); Optimistic vs Pessimistic

refs:

  • https://foo.bar
  • Optimistic Locking - Take no lock up front. When you are about to commit a transaction, check that no other transaction has updated the specific "record" you are working on; if one has, abort and retry.
  • Pessimistic Locking - Lock the record before working on it, so other transactions wait until you commit.
  • NOTE: Both have pros and cons. Learn when to use which.
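
Optimistic locking is commonly done with a version number per record. Below is a minimal in-memory sketch of that idea; VersionedStore and StaleWriteError are hypothetical names, and a real DB would do the check inside the UPDATE itself.

```python
class StaleWriteError(Exception):
    """Raised when the record changed since the caller read it."""


class VersionedStore:
    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); a missing key is version 0 with value None."""
        return self._rows.get(key, (0, None))

    def write(self, key, expected_version, value):
        """Write only if nobody else wrote since `expected_version` was read."""
        current_version, _ = self._rows.get(key, (0, None))
        if current_version != expected_version:
            raise StaleWriteError(
                f"{key!r} is at version {current_version}, expected {expected_version}"
            )
        new_version = current_version + 1
        self._rows[key] = (new_version, value)
        return new_version
```

If two clients read the same version and both try to write, the first write wins and the second gets StaleWriteError, at which point it re-reads and retries.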

Strong Consistency vs Eventual Consistency

  • Strong consistency: reads always see the latest write (RDBMS)
  • Eventual consistency: reads may return stale data for a while, but eventually see the latest write (NoSQL)
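
A toy sketch of the difference, assuming one primary and one lagging replica. The Cluster class and its method names are made up, and sync() stands in for whatever replication a real store does in the background.

```python
class Cluster:
    """Writes hit the primary at once; the replica only catches up on sync()."""

    def __init__(self):
        self._primary = {}
        self._replica = {}
        self._pending = []  # replication log not yet applied to the replica

    def write(self, key, value):
        self._primary[key] = value
        self._pending.append((key, value))

    def read_strong(self, key):
        """Read from the primary: always sees the latest write."""
        return self._primary.get(key)

    def read_eventual(self, key):
        """Read from the replica: may be stale until sync() has run."""
        return self._replica.get(key)

    def sync(self):
        """Apply the pending log to the replica, in write order."""
        for key, value in self._pending:
            self._replica[key] = value
        self._pending.clear()
```

Right after a write, read_strong returns the new value while read_eventual may still return the old one; after sync() both agree.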

RDBMS vs NoSQL

  • NoSQL is getting really popular nowadays, but don't pick it just for the hype.

Types of NoSQL

  • key-value
  • wide column
  • document based
  • graph based

Note the current conf.

Caching

  • Local cache: every node does its own caching; nothing is shared
  • Distributed cache: the cache is shared between nodes

Points of Concern:

  • Cache lives in memory, so keep it small
  • A cache cannot be treated as the source of truth
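
Both concerns can be sketched with a small bounded LRU cache: it has a fixed capacity, and every miss goes back to the real source of truth through a loader function. LRUCache and the load callback are illustrative names only, and LRU is just one possible eviction policy.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, load):
        """Return the cached value; on a miss, call load(key) and cache the result."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        value = load(key)  # the source of truth (DB, service, ...)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
        return value

    def invalidate(self, key):
        """Forget a key, e.g. after the underlying record was updated."""
        self._items.pop(key, None)
```

The invalidate method is there because the cache is not the source of truth: whenever the underlying data changes, the cached copy has to go.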

Data Centers/Racks/Hosts

Key points of interest may be:

  1. Latency between hosts or racks
  2. What are the contingency plans for when racks or even DCs go down?

RAM/CPU/HDD/Internet Bandwidth

  • Everything must be designed to fit well within these constraints.
  • These constraints are where throughput and latency improvements have to come from.

Random and/or Sequential Read/Write

refs:

HTTP vs HTTP2 vs WebSockets

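  • WebSockets keep one long-lived, bidirectional connection open, which suits real-time pushes in both directions. They complement plain request/response HTTP rather than trumping it.
  • HTTP2 covers deficiencies of HTTP/1.1, such as allowing many concurrent requests to be multiplexed over one connection (the limit is negotiated via the SETTINGS_MAX_CONCURRENT_STREAMS setting)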

TCP/IP Stack

Have understood the basics.

The OSI model has 7 layers, while the TCP/IP model is usually described with 4. Might be good to give it a gander once more.

IPV4 vs IPV6

  • We are running out of IPv4 addresses
  • IPv4 = 32 bits vs IPv6 = 128 bits (remember the go-discord-irc conundrum)
  • Some power systems
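
The 32-bit vs 128-bit difference is easy to check with the standard ipaddress module. The two addresses below come from the reserved documentation ranges and are only examples; address_space is a made-up helper.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")    # IPv4 documentation range
v6 = ipaddress.ip_address("2001:db8::1")  # IPv6 documentation range


def address_space(addr):
    """Total number of addresses for this address family (2 ** bit length)."""
    return 2 ** addr.max_prefixlen


print(v4.version, v4.max_prefixlen, address_space(v4))
print(v6.version, v6.max_prefixlen, address_space(v6))
```

IPv4 tops out at 2**32 (about 4.3 billion) addresses, which is why they are running out; IPv6 has 2**128.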

TCP vs UDP

  • UDP is fast and connectionless, and does not care about some packet loss. Good for audio/video streams.
  • TCP guarantees ordered, reliable delivery (lost packets are retransmitted), which makes it inherently a bit slower.

DNS Lookup

  • KNOWN: we run DNS servers
  • Understand DNS cache poisoning (largely mitigated nowadays, e.g. by source port randomization and DNSSEC)
  • Using PowerDNS with MySQL as the backend, with DB replication
  • The tcpdump output still scares me, so I need to understand it better, e.g. to recognise ARP poisoning
  • Explore DynDNS

HTTPS and TLS

Note: Can be elaborated upon

  • People who use plain HTTP should be sentenced to staying away from computers.
  • I have understood the fundamentals of the TLS handshake. Cryptography is a complex subject, though: I understand the Diffie-Hellman key exchange and stream and block ciphers in general, but TLS in its most complicated forms is still a bit of a mystery. (REFINE BEFORE TALK)
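
As a refresher on the Diffie-Hellman part, here is a toy sketch of the exchange. P = 23 and G = 5 are textbook-sized numbers picked for readability and are nowhere near secure; real TLS uses large standardized groups or elliptic curves. The function names are made up.

```python
import secrets

P = 23  # toy prime modulus (assumption: far too small for real use)
G = 5   # generator for the toy group


def keypair():
    """Pick a random private exponent and derive the public value G^private mod P."""
    private = secrets.randbelow(P - 2) + 1  # in [1, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(their_public, my_private):
    """Both sides end up with G^(a*b) mod P without ever sending a or b."""
    return pow(their_public, my_private, P)


alice_private, alice_public = keypair()
bob_private, bob_public = keypair()

# Only the public values cross the wire; both sides compute the same secret.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
```

The shared value would then be fed into a key derivation step to get the symmetric keys for the stream or block cipher.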

PKI and CAs

  • CAs sign certificates, vouching that a public key really belongs to the party named in the cert, i.e. that it is recognized and authorized.
  • Helps prevent MITM attacks
  • See Georg for more.

Symmetric vs Asymmetric Encryption

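  • Symmetric: one shared key for both encryption and decryption, e.g. AES
  • Asymmetric: a public/private key pair, e.g. RSA as used in PKI. Computationally expensive, so it is typically used only to agree on a symmetric key.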

Load Balancers

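  • Operate at L4 (transport) or, more commonly, L7 (application)
  • Nginx fits in as a reverse proxy that can load-balance at L7 (HTTP), and also at L4 via its stream module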
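
The simplest balancing strategy is round robin with unhealthy hosts skipped. A minimal sketch follows; the class name and the backend labels are hypothetical, and real balancers add health checks, weights and connection draining on top.

```python
import itertools


class RoundRobinBalancer:
    """Hand out backends in turn, skipping any that are marked down."""

    def __init__(self, backends):
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("need at least one backend")
        self._healthy = set(self._backends)
        self._cycle = itertools.cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        """Return the next healthy backend, or raise if none are left."""
        # One full lap over the ring is enough to find a healthy host if any exists.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

With backends a, b and c the requests go a, b, c, a, and so on; once b is marked down it is simply skipped until mark_up is called.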

CDN and Edge

  • Let's say you want to stream a movie and I have it in my datacentre halfway around the globe. CDNs place the content/resource closer to you, which gives better performance and latency and also costs the org less in traffic over clogged long-distance lines.
  • Edge builds on this, using a dedicated network to speed things up further (READ MORE)

Bloom Filters and Count-min sketch

  • Space-efficient probabilistic data structures.
  • BF - Used to decide whether an element is part of a set or not. May give false positives but never false negatives. Very space efficient (READ MORE)
  • CMS - Event frequency counter. Uses a fraction of the space of exact counting to arrive probabilistically at an answer close to the true count; it may overestimate but never underestimates.
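
A minimal sketch of both structures, assuming salted SHA-256 as the hash family. The sizes, the number of hashes and the helper names are arbitrary choices for illustration; real implementations use faster hashes and size the tables from a target error rate.

```python
import hashlib


def _hashes(item, count, modulus):
    """Yield `count` indexes in [0, modulus) by salting SHA-256 with the hash number."""
    for i in range(count):
        digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
        yield int.from_bytes(digest[:8], "big") % modulus


class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def add(self, item):
        for idx in _hashes(item, self.num_hashes, self.size):
            self.bits[idx] = True

    def might_contain(self, item):
        """False means definitely absent; True means present or a false positive."""
        return all(self.bits[idx] for idx in _hashes(item, self.num_hashes, self.size))


class CountMinSketch:
    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def add(self, item, count=1):
        # One counter per row, each chosen by a different hash.
        for row, col in enumerate(_hashes(item, self.depth, self.width)):
            self.table[row][col] += count

    def estimate(self, item):
        """Minimum over the rows: collisions can only inflate a counter, never shrink it."""
        return min(
            self.table[row][col]
            for row, col in enumerate(_hashes(item, self.depth, self.width))
        )
```

Anything added to the Bloom filter always answers True afterwards, and the sketch's estimate is always at least the true count, which matches the no-false-negatives and overestimate-only properties.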

Paxos

VMs and Containers

  • A VM is a full system on a system (a guest OS on a hypervisor); containers are self-contained, isolated processes that share the host kernel

Map Reduce
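
The classic illustration is a word count, sketched here in a single process. The three function names mirror the usual map, shuffle and reduce phases and are otherwise made up; a real framework would run the map and reduce steps on many hosts.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))
```

Because each map call only looks at one document and each reduce only looks at one key, both phases can be spread across hosts, which is the whole point.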

Concurrency, threading
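
A minimal sketch of the basic threading concern, shared state guarded by a lock. Counter and run_workers are illustrative names. Without the lock, the read-modify-write in increment is not guaranteed to be atomic and updates can be lost.

```python
import threading


class Counter:
    """A counter that is safe to increment from several threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def run_workers(counter, num_threads=4, increments=1000):
    """Start the threads, wait for all of them, and return the final count."""

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place the final value is always num_threads * increments.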