Tuesday, January 23rd, 2024

The simplest things are often the truest.

— Richard Bach

  • Terraform dynamic blocks are useful for conditional blocks: giving for_each an empty collection omits the nested block entirely.
  • I should build a tagging system for TTPs so that the vault is actually useful for identifying the next step.
    • #ttp/as/...: minimum credential / privileges needed to carry out the attack
      • *nix systems
        • unixuser
        • root
      • AD environments
        • localuser
        • domainuser
        • service
        • localadmin
        • domainadmin
    • #ttp/to/...: if the TTP is a privilege escalation, this tag would describe the target privilege level
      • same principals as above
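
A minimal sketch of how the two tag families could drive a "what can I do next" lookup. Only the #ttp/as/... and #ttp/to/... tag shapes come from the notes above; the note titles, the Ttp class, and the parse_tags / next_steps helpers are hypothetical. It assumes exact matching on principals, so it does not model the fact that e.g. domainadmin implies the lower levels.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ttp:
    name: str                 # hypothetical note title
    as_: str                  # value of the #ttp/as/... tag
    to: Optional[str] = None  # value of the #ttp/to/... tag, if an escalation


def parse_tags(name: str, tags: list[str]) -> Ttp:
    """Build a Ttp from raw tags such as '#ttp/as/domainuser'."""
    as_, to = None, None
    for tag in tags:
        parts = tag.lstrip("#").split("/")
        if len(parts) == 3 and parts[0] == "ttp":
            if parts[1] == "as":
                as_ = parts[2]
            elif parts[1] == "to":
                to = parts[2]
    if as_ is None:
        raise ValueError(f"{name}: missing #ttp/as/... tag")
    return Ttp(name, as_, to)


def next_steps(held: set[str], ttps: list[Ttp]) -> list[Ttp]:
    """Return the TTPs usable with the principals currently held (exact match)."""
    return [t for t in ttps if t.as_ in held]


# Hypothetical vault contents, for illustration only.
vault = [
    parse_tags("example-recon", ["#ttp/as/domainuser"]),
    parse_tags("example-privesc", ["#ttp/as/domainuser", "#ttp/to/localadmin"]),
    parse_tags("example-dump", ["#ttp/as/localadmin", "#ttp/to/domainadmin"]),
]

for t in next_steps({"domainuser"}, vault):
    print(t.name, "->", t.to or "(no escalation)")
```

After a successful escalation, adding the TTP's "to" principal to the held set and calling next_steps again gives the next layer of options.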

ECS165A Lecture 5:

  • row-based layout: Data is organized by records. Columnar data belonging to the same record is aggregated.
    • If a filter matches more than about 5% of records (e.g. age > 20 when 80% of records have age above 20), consider not using an index, since at that point it provides no benefit over a linear scan.
  • column-based layout: Data is organized by columns. The same type of data (i.e. data in the same column) is aggregated together & ordered by row ID/key.
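
A small sketch of the two layouts using in-memory Python lists. The table contents and helper names are made up for illustration; real systems lay these out in disk pages, which this does not model.

```python
# Hypothetical three-record table: (id, name, age).
records = [
    (1, "alice", 31),
    (2, "bob", 19),
    (3, "carol", 45),
]

# Row-based layout: the fields of one record are stored together.
row_store = list(records)

# Column-based layout: one list per column; position i belongs to record i.
col_store = {
    "id": [r[0] for r in records],
    "name": [r[1] for r in records],
    "age": [r[2] for r in records],
}


def fetch_record_row(store, i):
    """One lookup returns the whole record."""
    return store[i]


def fetch_record_col(store, i):
    """The record has to be stitched together from every column."""
    return (store["id"][i], store["name"][i], store["age"][i])


def avg_age_row(store):
    """Scans every record even though only one field is needed."""
    return sum(r[2] for r in store) / len(store)


def avg_age_col(store):
    """Touches only the 'age' column."""
    return sum(store["age"]) / len(store["age"])


print(fetch_record_row(row_store, 1), fetch_record_col(col_store, 1))
print(avg_age_row(row_store), avg_age_col(col_store))
```

Fetching a whole record is the natural operation on the row layout, while a single-column aggregate is the natural operation on the column layout.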

L-Store

  • To accommodate very high data velocity, we split database systems into two roles: OLTP and OLAP.
  • OLTP (online transaction processing): write-optimized database system
    • Over-provision to prepare for the worst.
    • Usually row-based layout, capable of doing uncompressed in-place updates.
    • Examples: SQL Server, Oracle, IBM DB2…
    • Pros: The database system can handle very high load.
    • Cons: When usage is low, a lot of computing power is wasted sitting idle. Some companies (e.g. Amazon) overcame this by providing cloud services: virtualizing and selling unused system resources.
  • OLAP (online analytical processing): read/analytics-optimized database system
    • OLAP is used to power business intelligence products, e.g. PowerBI.
    • Examples: Spark, Hadoop, IBM DB2…
    • Usually column-based layout, which produces highly compressed & read-only pages (high data homogeneity gives a high compression rate). Compression speeds up disk access because fewer bytes have to be read.
    • Pros: Much more performant analytics
    • Cons: Needs up-to-date data to be transferred from the OLTP database (Extract-Transform-Load/ETL). Cannot do analytics or policy adjustments (e.g. pricing) in real time.
  • There are way too many database products for different use cases. It is becoming more and more difficult to choose the right product.
  • Can we have a single database that does OLTP and OLAP at the same time?
  • Traditional multi-version indexing
    • Go without in-place updates, with the caveat of much slower write performance: write modified data as a new record version somewhere else on disk, and update all indices to point to the new location (the new row version). This makes updating records more expensive from the get-go.
    • To improve the write performance, we add a layer of indirection. Use a faster disk (e.g. SSD) to serve as the lookup table from logical ID (ID for the record) to row ID (points to the physical location on the HDD where the actual data resides). This decouples indices from a record’s physical location and avoids multiple disk writes (one for each index) on a single record update.
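
A rough sketch of the difference the indirection layer makes, counting index writes per update. Everything here is an assumption for illustration: the class names, the use of dicts for indices, unique indexed values, and a Python list standing in for append-only disk storage. It is not L-Store's actual design.

```python
class DirectTable:
    """Indices map key -> physical location, so every update repoints every index."""

    def __init__(self, indexed_cols):
        self.disk = []  # append-only row versions (stands in for the HDD)
        self.indices = {c: {} for c in indexed_cols}
        self.index_writes = 0

    def insert(self, row):
        self.disk.append(row)
        loc = len(self.disk) - 1
        for col, idx in self.indices.items():
            idx[row[col]] = loc
        return loc

    def update(self, loc, changes):
        old_row = self.disk[loc]
        new_row = {**old_row, **changes}
        self.disk.append(new_row)  # new version written elsewhere
        new_loc = len(self.disk) - 1
        for col, idx in self.indices.items():
            del idx[old_row[col]]
            idx[new_row[col]] = new_loc  # one write per index
            self.index_writes += 1
        return new_loc


class IndirectTable:
    """Indices map key -> logical ID; one lookup table maps logical ID -> row ID."""

    def __init__(self, indexed_cols):
        self.disk = []
        self.lid_to_rid = {}  # indirection table (the "faster disk")
        self.indices = {c: {} for c in indexed_cols}
        self.index_writes = 0
        self.mapping_writes = 0

    def insert(self, row):
        self.disk.append(row)
        lid = len(self.lid_to_rid)
        self.lid_to_rid[lid] = len(self.disk) - 1
        for col, idx in self.indices.items():
            idx[row[col]] = lid
        return lid

    def update(self, lid, changes):
        old_row = self.disk[self.lid_to_rid[lid]]
        new_row = {**old_row, **changes}
        self.disk.append(new_row)
        self.lid_to_rid[lid] = len(self.disk) - 1  # the only mandatory write
        self.mapping_writes += 1
        for col, idx in self.indices.items():
            if new_row[col] != old_row[col]:  # touch an index only if its key changed
                del idx[old_row[col]]
                idx[new_row[col]] = lid
                self.index_writes += 1
        return lid

    def lookup(self, col, key):
        return self.disk[self.lid_to_rid[self.indices[col][key]]]


row = {"name": "alice", "city": "davis", "balance": 10}

direct = DirectTable(["name", "city"])
loc = direct.insert(row)
direct.update(loc, {"balance": 20})

indirect = IndirectTable(["name", "city"])
lid = indirect.insert(row)
indirect.update(lid, {"balance": 20})

print(direct.index_writes, indirect.index_writes, indirect.mapping_writes)
```

With two indices and an update to a non-indexed column, the direct version rewrites both indices, while the indirect version writes only the one lookup-table entry.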

ECS122A Lecture 5

Midterm next week

Study guide is available on Canvas

  • from last lectures
  • master theorem
  • maximum sum subarray problem
    • Goal: Find the subarray with the maximum sum
    • Naive divide and conquer: find maximum sum subarray in subproblems, then choose the best out of two subproblems.
      • Incorrect algorithm: it cannot find subarrays that span both subproblems.
    • brute force:
      • three nested loops, O(n^3) overall:
        • iterate over all possible starting points for the subarray:
          • iterate over all possible ending points for the subarray:
            • iterate over all elements in the current subarray:
              • add the element to the running total
    • More sophisticated divide and conquer
      • Code available in lecture notes
      • Base case: one element subarray—return the tuple (begin, end, A[begin])
      • Divide into subarray
      • Calculate if subarray
    • Kadane’s algorithm:
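
The lecture's own code is in the lecture notes and is not reproduced here. What follows is a sketch of the three approaches in their standard textbook form, which may differ from the lecture's version. Function names are my own; each returns the tuple (begin, end, sum) used in the base case above, and each assumes a non-empty list.

```python
def max_subarray_brute(A):
    """Three nested loops, O(n^3): try every (begin, end) and sum it."""
    best = (0, 0, A[0])
    for begin in range(len(A)):
        for end in range(begin, len(A)):
            total = 0
            for k in range(begin, end + 1):
                total += A[k]
            if total > best[2]:
                best = (begin, end, total)
    return best


def max_crossing(A, begin, mid, end):
    """Best subarray that contains both A[mid] and A[mid + 1]."""
    left_sum, total, left = float("-inf"), 0, mid
    for i in range(mid, begin - 1, -1):
        total += A[i]
        if total > left_sum:
            left_sum, left = total, i
    right_sum, total, right = float("-inf"), 0, mid + 1
    for j in range(mid + 1, end + 1):
        total += A[j]
        if total > right_sum:
            right_sum, right = total, j
    return (left, right, left_sum + right_sum)


def max_subarray_dc(A, begin=0, end=None):
    """Divide and conquer, O(n log n): best of left, right, and crossing."""
    if end is None:
        end = len(A) - 1
    if begin == end:  # base case: one-element subarray
        return (begin, end, A[begin])
    mid = (begin + end) // 2
    candidates = [
        max_subarray_dc(A, begin, mid),
        max_subarray_dc(A, mid + 1, end),
        max_crossing(A, begin, mid, end),  # the case the naive version misses
    ]
    return max(candidates, key=lambda c: c[2])


def max_subarray_kadane(A):
    """Kadane's algorithm, O(n): track the best subarray ending at each index."""
    best = (0, 0, A[0])
    cur_begin, cur_sum = 0, A[0]
    for i in range(1, len(A)):
        if cur_sum < 0:  # a negative prefix never helps, so restart at i
            cur_begin, cur_sum = i, A[i]
        else:
            cur_sum += A[i]
        if cur_sum > best[2]:
            best = (cur_begin, i, cur_sum)
    return best


A = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
print(max_subarray_brute(A), max_subarray_dc(A), max_subarray_kadane(A))
```

The crossing step is what fixes the naive divide and conquer: it explicitly considers subarrays that straddle the midpoint.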