instagram ripper removes trailing _ from usernames #1182

Closed
cyian-1756 opened this issue Jan 21, 2019 · 2 comments

@cyian-1756
Collaborator

  • Ripme version: 1.7.76
  • Java version:
  • Operating system:

Expected Behavior

Rips the user and saves the pictures to a folder called indithew_

Actual Behavior

Strips the trailing _ and saves the images to a folder called indithew

This bug was reported by ManOfSin666 on reddit here

@Gamerick
Contributor

Gamerick commented Feb 6, 2019

Right. I found the location of the issue, and I'm going to need some input from others before submitting a pull request, as this isn't an Instagram-specific issue.

The issue itself lies with the implementation of the Utils.fileSystemSafe() method which explicitly removes trailing underscores from file paths at the final replaceAll() of this line:

text = text.replaceAll("[^a-zA-Z0-9.-]", "_").replaceAll("__", "_").replaceAll("_+$", "");

I'm not savvy enough with file paths to know if this behaviour is necessary. If it isn't then simply removing that line will work. Otherwise we'd need to change that implementation to something that appends a safe, non-intrusive character rather than simply deleting it altogether.

Also, looking at the attached reddit thread, it appears that usernames may have leading underscores as well, which in this case would be removed by the second replaceAll() in the block above, i.e. a user with username "_jeremy_" would produce the unsanitised folder name "instagram__jeremy_", which would become "instagram_jeremy".
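To make the effect of each replaceAll() concrete, here is a minimal sketch that models only the single line quoted above. The class and method names are hypothetical, and the real Utils method may do more than this one line.

```java
// Hypothetical helper class; it models only the line quoted in this comment.
final class FileSystemSafeSketch {
    private FileSystemSafeSketch() {}

    // The full chain as quoted: replace unsafe characters, collapse "__",
    // then strip any trailing underscores.
    static String current(String text) {
        return text.replaceAll("[^a-zA-Z0-9.-]", "_")
                   .replaceAll("__", "_")
                   .replaceAll("_+$", "");
    }

    // Variant without the final replaceAll(): trailing underscores survive,
    // but a leading underscore next to the "_" separator is still collapsed.
    static String withoutTrailingStrip(String text) {
        return text.replaceAll("[^a-zA-Z0-9.-]", "_")
                   .replaceAll("__", "_");
    }

    // Variant that only replaces unsafe characters and keeps every underscore.
    static String replaceOnly(String text) {
        return text.replaceAll("[^a-zA-Z0-9.-]", "_");
    }
}
```

With this sketch, current("instagram_indithew_") yields "instagram_indithew", withoutTrailingStrip("instagram__jeremy_") yields "instagram_jeremy_", and replaceOnly("instagram__jeremy_") leaves the name unchanged.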

My proposal is to simply prepend and append a character to usernames in this function from the AbstractRipper Class

public String getAlbumTitle(URL url) throws MalformedURLException { return getHost() + "_" + getGID(url); }

Like so

public String getAlbumTitle(URL url) throws MalformedURLException { return getHost() + "_" + "*" + getGID(url) + "*"; }

With the star being a placeholder for now. That, or we leave it to the specific rippers to override said method. If the latter, I'll go ahead and make the specific implementation for Instagram.
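The second option, leaving it to the specific ripper, could look roughly like the sketch below. SketchRipper and SketchInstagramRipper are hypothetical stand-ins rather than the project's real classes, the getGID() logic is an assumption made for illustration, and MARKER keeps the "*" placeholder from this comment even though a later reply points out that "*" is not file system safe.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical stand-in for AbstractRipper; only the quoted method is modelled.
abstract class SketchRipper {
    abstract String getHost();

    abstract String getGID(URL url) throws MalformedURLException;

    public String getAlbumTitle(URL url) throws MalformedURLException {
        return getHost() + "_" + getGID(url);
    }
}

// Hypothetical Instagram-specific ripper that changes only the album title.
class SketchInstagramRipper extends SketchRipper {
    // Placeholder taken from the comment above, not a final choice.
    static final String MARKER = "*";

    @Override
    String getHost() {
        return "instagram";
    }

    // Assumption: the username is the first non-empty path segment of the URL.
    @Override
    String getGID(URL url) throws MalformedURLException {
        for (String segment : url.getPath().split("/")) {
            if (!segment.isEmpty()) {
                return segment;
            }
        }
        throw new MalformedURLException("No username found in " + url);
    }

    // Wraps the username so its own leading and trailing underscores are
    // no longer at the edges of the title.
    @Override
    public String getAlbumTitle(URL url) throws MalformedURLException {
        return getHost() + "_" + MARKER + getGID(url) + MARKER;
    }
}
```

Overriding in the subclass keeps the change local to one ripper, whereas changing the base class would alter the folder names produced by every ripper.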

@cyian-1756
Collaborator Author

cyian-1756 commented Feb 9, 2019

I'm not savvy enough with file paths to know if this behaviour is necessary

I'm pretty sure I wrote that code to deal with rippers that didn't bother dealing with trailing _s

If it isn't then simply removing that line will work

Just removing the .replaceAll("_+$", "") should work

i.e. a user with username "_jeremy_" would produce an unsanitised folder name "instagram__jeremy_" would become instagram_jeremy.

In that case we should pull that statement as well.

My proposal is to simply prepend and append a character to usernames in this function from the AbstractRipper Class

It's possible, but what could we use? The only safe chars to use are a-zA-Z0-9._-

"*" + getGID(url) + "*"

* isn't file system safe.

I think this would be better solved by implementing the metadata feature that was talked about here. It would be trivial to store the full unchanged username in a JSON file, and we wouldn't have to worry about filesystem safety.
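As a rough illustration of that idea, the sketch below writes the unchanged username to a JSON file inside the album folder. Everything here is an assumption made for the example: the class name, the "metadata.json" file name, and the "username" key are hypothetical and are not taken from the metadata feature discussion.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the file name and JSON key are illustrative only.
final class AlbumMetadataSketch {
    private AlbumMetadataSketch() {}

    // Builds a one-field JSON object, escaping quotes, backslashes and
    // control characters so that any username round-trips unchanged.
    static String toJson(String username) {
        StringBuilder sb = new StringBuilder("{\"username\": \"");
        for (char c : username.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append("\"}").toString();
    }

    // Writes the JSON next to the ripped images; the folder name itself can
    // stay sanitised because the original username lives in the file.
    static Path write(Path albumDir, String username) throws IOException {
        Files.createDirectories(albumDir);
        Path file = albumDir.resolve("metadata.json");
        Files.write(file, toJson(username).getBytes(StandardCharsets.UTF_8));
        return file;
    }
}
```

The folder could then keep its sanitised name, and anything that needs the exact username would read it from the file.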

soloturn closed this as completed on Jan 4, 2025