The link to the problem: https://leetcode.com/problems/longest-absolute-file-path/
(Also the same question in CodeSignal — longestPath)
Suppose we have a file system that stores both files and directories.
It looks like this (with ⟶ representing the tab character):
dir
⟶ subdir1
⟶ ⟶ file1.ext
⟶ ⟶ subsubdir1
⟶ subdir2
⟶ ⟶ subsubdir2
⟶ ⟶ ⟶ file2.ext
If we were to write this representation in code, it will look like this: "dir\n\tsubdir1\n\t\tfile1.ext\n\t\tsubsubdir1\n\tsubdir2\n\t\tsubsubdir2\n\t\t\tfile2.ext"
. Note that the '\n'
and '\t'
are the new-line and tab characters.
Every file and directory has a unique absolute path in the file system, which is the order of directories that must be opened to reach the file/directory itself, all concatenated by '/'s
. Using the above example, the absolute path to file2.ext
is "dir/subdir2/subsubdir2/file2.ext"
. Each directory name consists of letters, digits, and/or spaces. Each file name is of the form name.extension
, where name
and extension
consist of letters, digits, and/or spaces.
Given a string input
representing the file system in the explained format, return the length of the longest absolute path to a file in the abstracted file system. If there is no file in the system, return 0
.
Thoughts:
For me, I think the most difficult part of this problem is to understand the problem description. If you take a deep look at the example input "dir\n\tsubdir1\n\t\tfile1.ext\n\t\tsubsubdir1\n\tsubdir2\n\t\tsubsubdir2\n\t\t\tfile2.ext"
, we can find some tips that can help us to solve the problem:
- The input is formed by folder names and file names which are all separated by new-line character ‘\n’.
- The number of the tab character ‘\t’ represents the level of the folder or file from the root directory. For example, we can see that there are 2 tabs before file1.ext, this means that we need to open two folders if we want to reach the file.
- The goal of this problem is to find the longest absolute path to a file. Hence, it is important to know how to get the absolute path lengths every time when we find a file.
So our first question is, how can we get the absolute path of a certain file?
According to our first fact, we can simply get all the folder and file names by using the string.split() method and put all the names into a string array.
However, what are we going to do when we see a file? (i.e., when we see a name that contains a dot.) The ideal situation is that we can immediately know “all the levels of the folders before this file”, then we can get the absolute path of this file. Or even better, if we always record the absolute path lengths of the previous folders, we can just add the current file length to its previous level folder’s length to get the length of the current absolute path.
Then our next question is, which data structure are we going to use to store the length of the folders. Let’s first take a look at this input again:"dir\n\tsubdir1\n\t\tfile1.ext\n\t\tsubsubdir1\n\tsubdir2\n\t\tsubsubdir2\n\t\t\tfile2.ext"
:
dir
⟶ subdir1
⟶ ⟶ file1.ext
⟶ ⟶ subsubdir1
⟶ subdir2
⟶ ⟶ subsubdir2
⟶ ⟶ ⟶ file2.ext
If we consider the root folder as level 0, then we have two level-1 folders, two level-2 folders, one level-2 file, and one level-3 file in the above example. (We can determine their level by the number of tabs.)
Hence, if we want to know the absolute path of a level-2 file, we need to know the absolute length of the level-1 folder; if we want to know the absolute length of the level-1 folder, we need to know the length of the level-0 folder. Hence, that will be great if we can have a container that provides us the length of the absolute path of the most recent folder. A stack will be a nice choice since its concept is LAST IN FIRST OUT.
Code Logic:
Hence, we will have a stack that records the lengths of the absolute path of the folders. The size of the stack should always the same as the current level number. We will remove the data if we found that there are too many “unnecessary lengths” in the stack. We only allow one length of each different level.
For example, if we see a level-2 folder, we will add its absolute path into the stack; if we again see a level-2 folder right after it, we will first pop up the previous level-2 folder then add our current level-2 folder.
Several important variables we will use in the code:
- splits: a string array that stores all the names (folders or files)
- stack: a stack that stores the absolute path length of each folder. When we store the length, we will always assume there will be a slash right after our current folder. For example, for our root folder “test”, we will store the length of “test/” instead of only “test”. Since folders will definitely need a / to contain other folders or files.
- maxLen: record the maximum absolute path length we have found so far.
- level: the level of the current folder or file name. We will determine its level by the number of tabs. Since \t in Java only represents the length of 1, we can find the last index of the tab symbol to determine the number of tabs. (We plus one since Java uses zero-based indexing)
- curLen: the actual length of the current folder or file name. It is important not to include those tab lengths into our absolute path length.
- preLen: the absolute path right before this folder or file. It will be 0 if we are at the root directory.
Below is the 3ms code that beats 23% —
public int lengthLongestPath(String input) {
int maxLen = 0;
String[] splits = input.split("\n");
Stack<Integer> stack = new Stack<>();
for(String cur: splits)
{
int level = cur.lastIndexOf("\t")+1;
int curLen = cur.length() - level;
while(stack.size() > level)
stack.pop();
int preLen = level==0? 0: stack.peek();
if(cur.contains("."))
maxLen = Math.max(maxLen, preLen+curLen);
else
stack.push(preLen+curLen+1);
}
return maxLen;
}
Instead of using the data structure Stack in Java, we can also just use an array and do exactly the same thing. We use an integer array to record the absolute length of each level. In this way, we don’t need to pop up the “old length” anymore, we will just override the length if we see 2 different folders on the same level.
Hence, we can get the below 0ms code that beats 100% —
public int lengthLongestPath(String input) {
int maxLen = 0;
String[] splits = input.split("\n");
int[] stack = new int[splits.length+1];
for(String cur: splits)
{
int level = cur.lastIndexOf("\t")+1;
int curLen = cur.length() - level;
int preLen = level==0? 0: stack[level-1];
if(cur.contains("."))
maxLen = Math.max(preLen+curLen, maxLen);
else
stack[level] = preLen + curLen + 1;
}
return maxLen;
}