Sigma OA 面试题解析:实现简化版 SigmaTable

16次阅读
没有评论

Implement a simplified version of Sigma’s core product.

We provide a few starter classes, Column, DataWarehouseTable, and DataWarehouse, and your task is to implement SigmaTable.

Background:

Sigma is a user interface that sits on top of various cloud data warehouses. We provide a unified way to interact with your data, despite differences in warehouse implementations.

Data warehouses contain tables, usually similar to SQL tables. Sigma has tables as well, in which you can create columns that are functions of other columns (for example, [My Sum Col] = [Data 1] + [Data 2]). We also have the ability to add summaries to our tables, which are aggregations (like a Min or Max) of all values in a column.

However, instead of storing or altering customer data, we store the operations that need to be performed on the base table to get the desired result. When the customer accesses a table in Sigma, we compile the operations to SQL, which we execute in the customer’s data warehouse against the latest version of their data.

The customer’s data may be private or sensitive, so storing their data on Sigma’s servers would be a security violation.

Implementation Details:

  • For the security reasons described above, do not store any warehouse data other than the get_data_warehouse method in the SigmaTable class. Do not store data computed from warehouse data in the SigmaTable either.
  • Do not store additional data in the data warehouse.
  • Trying to use a missing column should raise a ValueError.
  • Missing data should be treated as 0.
  • You can store whatever metadata you want in the SigmaTable.
  • In cases of ambiguity, make reasonable choices.

Testing Details:

  • We provide a small amount of test code that covers the basic functionality of the SigmaTable, as well as some debugging output. These tests are not comprehensive, so you should test your implementation yourself as well.
  • To run the provided test suite, set PRINT_ONLY variable in the "Test code" section to False and click the "Test my code" button.
  • To print the example table (created in make_test_table), set the PRINT_ONLY variable in the "Test code" section to True and click "Test my code".

Example Screenshots:

My Cool Data

Col A | Col B | A + B
1 | 4 | 5
2 | 5 | 7
3 | 6 | 9

Summary:
3 rows - 5 columns
Max of Col A: 3
Min of A + B: 5

这道题要求你实现一个简化版的 SigmaTable:它不直接存储仓库数据,而是只记录“如何计算”的操作,例如基于已有列创建新列、以及对列做 Min/Max 这类汇总。核心难点在于所有结果都必须保持对底层数据的动态响应,也就是当原始表数据变化时,派生列和 summary 也要随之更新,因此不能把计算结果缓存成静态值。实现时通常需要用元数据记录列之间的依赖关系,在取列数据时递归计算;对于缺失值按 0 处理,而引用不存在的列则要抛出 ValueError。

正文完
 0