Leetcode: 1948. Delete Duplicate Folders in System
- Mistake: Incorrect evaluation of solution's viability: Coded a more complex and wrong solution using hashes to avoid time limit when the simple solution was enough.
Can we simplify the problem while keeping it the same? Imagine that we can assign an id for each subtree where different subtrees have different numbers and equivalent subtrees have the same number. The problem becomes filtering all subtrees with duplicated id. For this problem, the id can be a serialization of the subtree like one generated by a pre-order traversing of the tree. Time complexity is \(O(n)\) where \(n\) is the number of nodes in the tree.
from typing import List class Solution: def deleteDuplicateFolder(self, paths: List[List[str]]) -> List[List[str]]: root = {} for path in paths: node = root for p in path: node = node.setdefault(p, {}) code = {} def encode(node): if not node: return "null" code[id(node)] = ",".join( f"({child},{encode(node[child])})" for child in sorted(node) ) return code[id(node)] encode(root) freq = Counter(code.values()) ans = [] def dfs(node, path): if not node: ans.append(path[::]) return if freq[code[id(node)]] > 1: return ans.append(path[::]) for child in sorted(node): path.append(child) dfs(node[child], path) path.pop() dfs(root, []) return ans[1:] assert Solution().deleteDuplicateFolder( [["a"], ["c"], ["d"], ["a", "b"], ["c", "b"], ["d", "a"]] ) == [["d"], ["d", "a"]] assert Solution().deleteDuplicateFolder( [ ["a"], ["c"], ["a", "b"], ["c", "b"], ["a", "b", "x"], ["a", "b", "x", "y"], ["w"], ["w", "y"], ] ) == [["a"], ["a", "b"], ["c"], ["c", "b"]] assert Solution().deleteDuplicateFolder([["a", "b"], ["c", "d"], ["c"], ["a"]]) == [ ["a"], ["a", "b"], ["c"], ["c", "d"], ]