Turkish problem #56
Comments
We need someone who knows German here probably, in my opinion |
Seems like the German transliterations are fine according to this thread: http://forum.grabaperch.com/forum/08-29-2014-encoding-of-german-umlaute-in-slugsfilenames |
At least for now you can add or override transliteration rules to fit your needs without waiting for a new release, like so: $slugify->addRule('ö', 'o');
$slugify->addRule('ü', 'u'); Actually, I would prefer to create a new class that extends Slugify, implements its interface, and adds these rules inside its constructor. That way you can change or add custom rules whenever you need, and you are always free to swap Slugify for anything else. |
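The subclass idea from the comment above can be sketched roughly as follows. This is a minimal, self-contained illustration and does not use the real Slugify classes: `SlugGenerator`, `BasicSlugGenerator` and `TurkishSlugGenerator` are hypothetical stand-ins, and the default rule table is an assumption that only mimics the German-style behaviour reported in this issue.

```php
<?php

// Hypothetical stand-in for the library's interface; NOT the real Slugify API.
interface SlugGenerator
{
    public function slugify(string $text): string;
}

// Hypothetical base class whose defaults mimic the German-style rules (ü => ue).
class BasicSlugGenerator implements SlugGenerator
{
    /** @var array<string, string> character => replacement */
    protected array $rules = [
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
        'Ä' => 'AE', 'Ö' => 'OE', 'Ü' => 'UE',
    ];

    public function addRule(string $character, string $replacement): void
    {
        // A later rule for the same character overrides the earlier one.
        $this->rules[$character] = $replacement;
    }

    public function slugify(string $text): string
    {
        $text = strtr($text, $this->rules);               // apply transliteration rules
        $text = strtolower($text);                        // only ASCII letters matter from here on
        $text = preg_replace('/[^a-z0-9]+/', '-', $text); // everything else becomes a separator
        return trim($text, '-');
    }
}

// Custom rules live in the constructor, so calling code never touches them.
class TurkishSlugGenerator extends BasicSlugGenerator
{
    public function __construct()
    {
        foreach (['ö' => 'o', 'Ö' => 'O', 'ü' => 'u', 'Ü' => 'U'] as $char => $replacement) {
            $this->addRule($char, $replacement);
        }
    }
}

echo (new BasicSlugGenerator())->slugify('Türkiye Kupasi'), "\n";   // tuerkiye-kupasi
echo (new TurkishSlugGenerator())->slugify('Türkiye Kupasi'), "\n"; // turkiye-kupasi
```

Because calling code depends only on the interface, the custom rules stay in one place and the underlying slug implementation can be replaced later without touching the callers.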
I speak German, and it is correct to transliterate ü to ue. We could probably add a Turkish ruleset. 🐯
|
The rules for Ä and Ö are wrong in Finnish, too. They should fall back to A and O. I'd imagine the same applies to Swedish and other Nordic languages. The addRule method is a good workaround for now! |
Here is my proposal to fix this and all other such issues in the future:
What do you guys think about this? It would be nice if you could provide some rules specific to the languages you know, so we can make a pull request with this proposal implemented. |
Ok, here are some of my thoughts on this issue:
As I see it we have two possibilities. Make the smallest changes possible to make this work for a new 1.x release or rethink how we could do this and make a 2.0 release. With the latter we could also try to tackle stuff like Persian. |
Let's even call it User can use Slugify in the following scenarios:
In all cases rulesets do the work, which is great: rulesets save us from copy-pasting every letter for every language and let us "tune" transliteration to what the user wants. In all cases we are not touching the interface, so this should not break any existing code, and there is probably no need for v2. |
Sounds reasonable. Do you have any ideas on the actual implementation? As mentioned, there is currently one big array, and if a ruleset is activated the rules of that set are copied into that big rules array. If we have a ruleset for every language, we need to copy all the rules into the big rules array and make sure that the given language/locale/ruleset is copied last (that is, it takes precedence). The other possible implementation (without making bigger changes to the code) is to iterate through the rulesets in each Both will affect performance. |
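The two strategies described above can be compared with a small sketch. It is plain PHP and not the library's code: the `$rulesets` table and the helpers `mergeRulesets()` and `applyRulesets()` are hypothetical names introduced only for illustration.

```php
<?php

// Hypothetical rulesets for illustration; not the library's actual data.
$rulesets = [
    'default' => ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue'],
    'turkish' => ['ö' => 'o', 'ü' => 'u', 'ş' => 's', 'ç' => 'c', 'ğ' => 'g', 'ı' => 'i'],
];

// Strategy 1: copy every active ruleset into one big array, once.
// Later rulesets overwrite earlier ones, so the preferred ruleset goes last.
function mergeRulesets(array $rulesets, array $order): array
{
    $merged = [];
    foreach ($order as $name) {
        // array_replace keeps keys as-is and lets the newer value win.
        $merged = array_replace($merged, $rulesets[$name]);
    }
    return $merged;
}

// Strategy 2: keep the rulesets separate and walk through them on every call.
// The preferred ruleset goes first, so its replacements are applied before
// lower-priority rulesets ever see those characters.
function applyRulesets(string $text, array $rulesets, array $priority): string
{
    foreach ($priority as $name) {
        $text = strtr($text, $rulesets[$name]);
    }
    return $text;
}

$merged = mergeRulesets($rulesets, ['default', 'turkish']);
echo strtr('türkiye käse', $merged), "\n";                                   // turkiye kaese
echo applyRulesets('türkiye käse', $rulesets, ['turkish', 'default']), "\n"; // turkiye kaese
```

In this sketch the merged array pays the copying cost once, when a ruleset is activated, while the per-call loop pays a smaller cost on every call. Which one is cheaper overall depends on how many strings are slugified per instance.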
Here is what I'm thinking about:
I think something like that can fix those issues and all future ones. Actually, we can even get rid of the constructor param, so after instantiating Slugify the user will call |
I don't think it is as easy as that. First of all, I think we should define what the "default" case is. For example, for a German speaker However, I think we should only move the conflicting transliterations into a ruleset. For example, Currently I am unsure whether this change would break compatibility. Since the very beginning of Slugify the German transliterations have also been the default, and changing the default would break existing software. |
Oh, I see your point now: it seems that German is used much more than Turkish, Finnish, etc. together, so the German transliteration rules should take precedence. But on the other hand, if we take the German rules as the default, then we need to provide many rulesets for languages like Finnish, Turkish, etc. that will duplicate each other. OK, so I'm stuck now :) At this moment it seems that the only way for people who do not like |
Maybe we could define a |
Here is what comes to mind: $slugify->activateRuleset('no-umlaut'); will solve the problem, and we can definitely do it, but I would still recommend using: $slugify->addRules(['ä' => 'a', 'Ö' => 'OE']); for the following reasons:
|
I honestly feel a library like this should be focused on how slugified characters look, not how they sound. Downcoded characters look correct in every language, as opposed to transliterated ones. Rather than assuming that everyone who uses the library is German by default, the transliteration could perhaps be explicitly enabled on a per-language basis? Just my two cents. |
This should be solved by #81 |
Hey,
Seems like some Turkish letters fall into German:
Türkiye Kupasi => tuerkiye-kupasi
While it should be turkiye-kupasi. Same problem with ö which should fall into o.
Not sure, but it seems Turkish should go to a ruleset.