CZ:Bot status/Fix-double-redirects/Community Input

From Citizendium, the Citizens' Compendium

This one appears to require deletion rights. I'm not sure we are ready for that :) D. Matt Innis 01:52, 15 January 2010 (UTC)

No, it does not. If you take a look at the test edit, you will see that James Jones (disambiguation)/Definition has not been touched. --Daniel Mietchen 02:35, 15 January 2010 (UTC)

The link that you provided concerning the script states:

Script which fixes double redirects, and deletes broken redirects.

Requires access to MediaWiki's maintenance pages or to a XML dump file. Delete function requires adminship.

I now realize that it refers to the "broken redirects" function that this script also performs, right? D. Matt Innis 03:09, 15 January 2010 (UTC)

Yes, it's the same script, but differently parametrized, which entails different prerequisites, and so I filed them differently. Perhaps we should rename the template's "Botname" to "Task" or some such. --Daniel Mietchen 09:11, 15 January 2010 (UTC)

Thanks for separating the two functions. I think we can run this one, but I have some reservations about the other one that I think we need to work through. I think the name was fine on this one, but the documentation could have been a little clearer to suggest that we were only going to use the "double" switch with it. Although, what's to keep someone from running it with the "broken" switch on? Can we lock the page that has the code when we make these things?

I assume that this bot, unlike the last one, will be run occasionally so it needs a permanent bot rather than just using the Housekeeping bot. It would be nice to have a place to "store" these. Have you got a plan for that? D. Matt Innis 12:48, 15 January 2010 (UTC)

Is this worth the effort? Presently, there are eight double redirects. These can be easily fixed manually. And moves can fix redirects when applied. --Peter Schmitt 22:17, 15 January 2010 (UTC)
I think it is worth the one-time effort to set up the automated fix, especially since we are still finding our way towards a CZ:Bot policy, in which these examples help. Even though it will have to be run on a regular basis, I do not see why Housekeeping Bot could not do it — it is clearly a housekeeping task, and it is less than 500 edits per month. In terms of improving documentation, the changes to the template that I alluded to in a recent comment include presenting the code, which would have clearly differentiated between python redirect.py double and python redirect.py broken and also between the test commands (which have a different summary and do not use the "-always" flag) and the real runs. The problem with this is that the code often contains special characters which interfere with the operation of a template, even within a <code></code> environment. --Daniel Mietchen 22:53, 15 January 2010 (UTC)
Thanks for chiming in, Peter. I thought the same thing initially about whether it was worth it and finally decided something similar to Daniel's reasoning. We need to practice our bot policy - part of which is to listen to feedback such as yours and give a reasoned answer as well, so here goes: No matter what the cause, double redirects occur and no-one is aware that they did unless someone manually looks through the double redirects page. Currently, the number is small and could easily be managed manually, I think. I would assume that the first redirect just needs to be changed to point to the third redirect. Who does this? Obviously, broken redirects are more of a problem and probably occur more often, so our next step is to decide if it is feasible to allow a bot to delete things without the risk of deleting the wrong thing. I haven't gotten there yet, so feel free to comment.
Daniel, I would think that the advantage of a permanent bot for fixing double redirects is that the idea is to make this function easy. How hard is it to change the script that the "Housekeeping bot" runs, and is this time-consuming enough that it would be easier to just manually make the change?
D. Matt Innis 00:49, 16 January 2010 (UTC)
Sure, this bot may be a good test case. A point to consider may be that the bot cannot decide whether the redirect "in the middle" should be deleted or not. --Peter Schmitt 01:59, 16 January 2010 (UTC)
Good point. As a point of policy development, I think these are the kinds of things that a Management Committee can consider part of their purview, though some might argue that this is a content decision that the EC should keep in the loop. One person creating bots just can't think of all the possible pros and cons. I hate to answer a question with a question, but I think we have to ask, "will a human *know* when to delete it?" When *do* we know to delete that middle link? D. Matt Innis 02:14, 16 January 2010 (UTC)
Well, the human may not "know" and make a mistake, or he may research the case (if in doubt). But a human has the chance to make a rational decision. Of course, checking a list of the bot-made changes is slightly quicker than checking double redirects because there is no need to change a redirect. --Peter Schmitt 11:20, 16 January 2010 (UTC)
I do not think it is ideal that we are constantly mixing arguments about python redirect.py double and python redirect.py broken.
python redirect.py double never deletes any page. As for the fate of the redirect "in the middle", it leaves it to humans. python redirect.py broken always deletes the page containing the broken redirect. Again, no decision-making involved.
I do not understand Matt's comment from 00:49, 16 January 2010 (UTC), but running python redirect.py double on any account does not technically have anything to do with running any other script from the same account (unless server performance is affected, or if the two scripts depend directly on each other), so here is a test edit performed via the bot account. --Daniel Mietchen 13:45, 16 January 2010 (UTC)
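For illustration only, the behaviour described above for the "double" mode (retarget the first redirect at the final page, never delete anything, leave the middle redirect to humans) can be sketched in a few lines of Python. This is not the code of redirect.py; the function name `resolve_double_redirects` and the in-memory dictionary standing in for the wiki's redirect table are assumptions made for the example.

```python
def resolve_double_redirects(redirects):
    """Return {source: final_target} for each redirect that points at another redirect.

    `redirects` maps a redirect page title to the title it points to.
    It is a hypothetical in-memory stand-in for the wiki's redirect table.
    Nothing is deleted: the "middle" redirect is left untouched.
    """
    fixes = {}
    for source, target in redirects.items():
        if target not in redirects:
            continue  # ordinary single redirect, nothing to fix
        seen = {source}
        final = target
        # Follow the chain until a non-redirect page or a loop is reached.
        while final in redirects and final not in seen:
            seen.add(final)
            final = redirects[final]
        if final in redirects:
            continue  # redirect loop: leave it for a human to sort out
        fixes[source] = final
    return fixes


# Hypothetical titles: "A" redirects to "B", which itself redirects to "C".
example = {"A": "B", "B": "C"}
print(resolve_double_redirects(example))  # {'A': 'C'}
```

Only "A" is retargeted in the example; "B" still redirects to "C", and whether "B" should exist at all remains a human decision.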
Excuse my ignorance (another reason for an experienced bot manager), but it wasn't a comment, it was a question. Is there a difference in time consumption or user difficulty to use the Housekeeping bot (I assume you have to do something with the script that it runs) versus using a Double Redirect bot that has the script or just fixing the redirect manually?
As an aside, can anyone change the script in a bot and run it? If that is the case, then it is going to be important that we limit who will have access to a bot account and make them responsible for its responsible use only under the direction of the EC or MC. D. Matt Innis 14:02, 16 January 2010 (UTC)
Ad (1): The more accounts someone operates, the higher the potential for confusion or mistakes. Other than that, using a separate account for fixing double redirects is only marginally more difficult than performing the same task from an existing account. In any case, I think doing this kind of maintenance by hand is more troublesome, more error-prone and more time-consuming.
Ad (2): Anyone can change the script and run their own copy of it from their account, but to change a script that a given Citizen is currently running from his computer, access to that Citizen's account on that computer is required, which is generally restricted to that Citizen, optionally some (other) admin of their system, and of course the occasional hacker. In order to run the changed script (or any other) via that Citizen's CZ account (or a CZ bot account), the corresponding login information is also necessary. --Daniel Mietchen 14:47, 16 January 2010 (UTC)

(undent)(1) Okay, I thought it would be easier to have an account that just had the script already and just sign in and run it, but if it is easier to just use the Housekeeping bot, then that's what we want.

So really anyone can run a bot from their own account. I wonder if this is something that we should try to regulate (make sure that all bots are approved and run from a bot account). Maybe we should allow constables to block unapproved accounts that are using a bot... (that's management committee stuff, though). That's also why we had to remove move and delete rights from author accounts; we had someone that created a move and delete bot from their account and raised havoc as a vandal for a couple of weeks.

Anyway, I think I'm ready to run this one, are you? D. Matt Innis 20:18, 16 January 2010 (UTC)

Will there be an easily available list of fixed double-redirects that can be used to check the middle ones, and where one can delete or mark those checked? (The log of the bot should not be used for this purpose.) Reason: Such double redirects may have been generated by correcting a move to a wrong name -- such a middle redirect should not stay. --Peter Schmitt 21:47, 16 January 2010 (UTC)
OK, I just ran it and will arrange to repeat this on a regular basis. --Daniel Mietchen 21:52, 16 January 2010 (UTC)
Peter, the best way to get such a list is to look at the bot's contributions. --Daniel Mietchen 21:57, 16 January 2010 (UTC)

(EC)I think Peter has a legitimate question. It seems that the contrib list from the bot should be something that we can document somewhere. Can you handle that, Daniel?

Also, let's publish a schedule of when these are going to be run and by whom. I can imagine some bots need permission before each run. D. Matt Innis 21:59, 16 January 2010 (UTC)

If you tell me what kind of documentation beyond the bot's list of contributions you need, then I can comment on the feasibility. As for scheduling, I would strongly suggest not to get too precise here, since volunteers cannot generally be expected to adhere strictly to any such plan. Perhaps it's better to phrase it in terms of "is recommended to be run once a week or if more than ten double redirects have accumulated", or some such. With respect to your question from 20:18, 16 January 2010 (UTC), I think blocking users who run unapproved bots is the way to go, though I would add that exceptions may be sensible for minor script-based changes for which full approval would be too much to ask. In such cases, hopefully, a brief check by one of the bot managers could substitute for formal approval. Taking away move and delete rights from Authors seems to make sense, but perhaps giving this right to more Editors than now may be a good move. --Daniel Mietchen 22:36, 16 January 2010 (UTC)
Just a link to the contribs list is all I need. We can't expect a user to know where to look, so let's list them.
I'm not ready to let anyone run a bot without showing they have satisfied our bot policy so I don't see letting anyone run one without approval here. Besides, we can't block one person and not another; it's too much to ask a constable to try and figure out who is running what. That reminds me. We need to require that the constables are aware that a bot is running, so we need to send an email to constables at citizendium dot com beforehand as well. The important thing to identify is how long it is expected to run and if the author will be available to turn it off or is the constable going to have to keep an eye out. D. Matt Innis 23:09, 16 January 2010 (UTC)
Perhaps I should explain in more detail why I think that a separate list (could also be a category) of changed redirects would be useful: If you resolve a double redirect there are two redirects (to the same page) of which (probably) one is useless (maybe even both). If the double redirect were repaired manually that would be an opportunity to clean up, as well. If the bot does it, then they can be "found" in the log created by the bot, but this log is not the place where redirects which have been checked (and cleaned up) should be deleted. The separate list is useful for bookkeeping. --Peter Schmitt 00:21, 18 January 2010 (UTC)
Sorry, Peter, I still do not get what precisely you are aiming at. If you go to the bot's contributions, that list will contain links to diffs of every change made, and from the diff, you can easily navigate to both original redirects. Is it correct to say that you want a list of these that saves a number of clicks along this way? --Daniel Mietchen 21:12, 18 January 2010 (UTC)
Matt, I agree that we should apply the bot policy consistently. Nonetheless, I think that some small-volume changes like these, done with this command should not require the full formal approval procedure if performed by someone who has already shown a certain level of bot literacy in a well-behaved way. I also do not see the need to keep bot accounts blocked as per default, as long as we operate on the principle of trust, and doing so does not make sense at all if we have approved regular bots such as the one to which this thread is dedicated. By the way, I noticed that your recent revisions to {{BotReq2}} damaged the formatting of CZ:Bot status. I have some ideas on what further information to include in future approval requests and will do some work on those templates before filing the next request. --Daniel Mietchen 21:42, 18 January 2010 (UTC)

(unindent)
Daniel, at the moment the list of the bot's contributions has the 8 double-redirects on top (and I know how many there were).

  • But after some time, and if one does not know the number, it will be difficult to find them in the list.
  • Moreover, if a Citizen checks one of these changes, there is no way to mark this entry as "checked", so no one will know if there are checks necessary.
  • (A minor point: The list does not give the "middle" redirect -- it has to be found by checking the previous revision.)

And checking the double-redirects is useful:
One of the redirects concerned the talk page of Android. It was created (by me, I overlooked it) when creating a disambiguation page. Thus both redirects should have been deleted, not fixed. In the course of checking this I also noticed redirected /Related Articles and /Definition subpages which also are not needed.
I think that this example proves that manual checking is useful and should be done. Moreover, I suspect that this example is quite typical of how double redirects are created. (I shall later check the others, too.) Checking can only be done conveniently either from the Special page listing double-redirects, or from a bot-created category (or list) of redirects to be checked, because then it is possible to remove them after the job is done. (Thus I am not sure that this bot really saves time.)
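The bookkeeping list asked for here (each fixed double redirect together with its "middle" redirect, plus a way to mark an entry as checked) could look roughly like the following Python sketch. It is only an illustration of the idea: the class `FixedRedirect`, the helper `pending` and the page titles are hypothetical, and nothing like this is claimed to exist in the bot or on the wiki.

```python
from dataclasses import dataclass


@dataclass
class FixedRedirect:
    """One double redirect that the bot has fixed (hypothetical record)."""
    source: str            # redirect that was retargeted by the bot
    middle: str            # the redirect "in the middle", left untouched
    final: str             # the page both redirects now point to
    checked: bool = False  # set once a Citizen has reviewed the entry


def pending(entries):
    """Return the entries that still need a manual check."""
    return [entry for entry in entries if not entry.checked]


# Hypothetical titles, for illustration only.
log = [
    FixedRedirect("A", "B", "C"),
    FixedRedirect("D", "E", "F"),
]
log[0].checked = True  # a Citizen has reviewed the first entry
print([entry.source for entry in pending(log)])  # ['D']
```

Unlike the bot's contribution list, such a record names the middle redirect directly and shrinks as entries are checked, which addresses the three points listed above.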

Daniel, frankly, I am more concerned that you aren't concerned that there isn't some precaution taken on every bot written by every person, every time. Of course you trust yourself, but everyone is fallible, including me. None of the requests are overburdening; they only require that the bot is properly tested and that the wiki can be easily fixed if something goes wrong, which it will one day. Also, as Peter illustrates above, what may seem easy to you may actually be more work for those that have to go behind and clean up the little things that the bot causes. This is the kind of thing that upsets editors and authors, and they have every right to be upset. Once we've streamlined this process, an experienced bot user won't have a problem working through the process in less than a day. Certainly the (all each) example above would be easy to pass through the process. I don't know enough to trust anyone to run bots freely from the Housekeeping bot, and as I am asked to keep the wiki safe, the default of my actions has to be to err on the side of caution. Once we've got the charter in effect and some thoughtful process occurs that involves more than just you and me, then I will do as they determine is reasonable. Until then I can't leave the Housekeeping bot unblocked, and I would have to block a bot that ran without running through the process.
This is a wiki, Daniel. Describing my changes as damage is a little dramatic, don't you think? The templates are serving my purposes really well at this point, but I always look forward to improvements. Nevertheless, maybe now you understand how editors feel when a bot blows through all their articles and makes a change that they don't understand. D. Matt Innis 03:13, 19 January 2010 (UTC)
Just for completeness: A further 5 of the double-redirects (James Jones, etc.) turned out to be just like "Android". One more (TEMPEST) was similar, but a little more complicated, and only 1 may be useful (if you want redirects "Paleoconservative" and "Paleoconservatism" parallel to each other). --Peter Schmitt 20:13, 19 January 2010 (UTC)
Ad Matt, I am well aware that infallibility is never guaranteed, and that automated actions may have more severe side effects than manual ones. I still do not see, however, why the Housekeeping Bot has to be blocked by default. Only the Constables and I have the account information for this bot, so we are certainly not talking about "anyone" running anything through that account. And since we agree that unauthorized bot actions from any account may lead to blocking of that account (e.g. my personal one), it is just a matter of defining what "unauthorized" actually entails (obviously, some testing must be allowed if it is required for approval, and there may be other things to consider). This is the process we are currently in, and if we end up with a situation in which things like the "each all" case can pass through swiftly, I shall be content.
Ad your second point: Of course it's a wiki, and I always welcome improvements and attempts on improvements. What I meant by "your recent revisions to {{BotReq2}}" having "damaged the formatting of CZ:Bot status" is that none of the information currently piped into the {{BotReq2}} template at CZ:Bot status is actually displayed via the template. Please excuse me if "damage" is not the right word to describe this situation. Besides, all my previous bot actions have always been listed at this dedicated page, and none of them can be said to have "blown" through anyone's (shouldn't we abandon that notion anyway?) articles, since most of the time, the page had never been seen by anyone before, and the only automated action affecting an existing cluster's main page (except for this single test edit of another standard script) was to remove Wikipedia's "Fact" template, which cannot be said to be controversial.
Ad Peter, yes, a real clean-up does involve checking the appropriateness of the redirects, which the bot can't judge. However, I have never witnessed anyone here looking at the matter as closely as you did on this occasion, so for most of the usual cleanup of redirects, the bot would be an appropriate replacement. I will not insist on running it and would be fine with putting it under "Postponed" for, say, three months, to give us some more time to consider the matter.
--Daniel Mietchen 23:00, 19 January 2010 (UTC)
Forgot to mention that the bot can be run in supervised mode, i.e. with each edit having to be confirmed by the operator. This way, control can be exerted over the suitability of the bot-proposed edits, and entries can be skipped. Will give this a demo now. --Daniel Mietchen 13:50, 28 February 2010 (UTC)
The edits of the demo are here. I skipped "National Affairs (Edit) →‎ National Affairs (magazine) →‎ National Interest (magazine)" and "The Accidental Guerilla (Edit) →‎ The Accidental Guerrilla →‎ David Kilcullen". --Daniel Mietchen 13:57, 28 February 2010 (UTC)
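As a rough illustration of the supervised mode mentioned above, where the operator confirms or skips each proposed edit, here is a minimal Python sketch. It does not reproduce the actual script's prompts or options; `run_supervised`, the `confirm` callback and the example titles are hypothetical names chosen for the example.

```python
def run_supervised(proposals, confirm):
    """Apply only the proposed redirect fixes that the operator confirms.

    `proposals` is a list of (source, new_target) pairs.
    `confirm` is a callable returning True to apply an edit and False to
    skip it; in interactive use it could wrap input() (an assumption,
    not how the real script prompts).
    """
    applied, skipped = [], []
    for source, new_target in proposals:
        if confirm(source, new_target):
            applied.append((source, new_target))  # a real bot would save the page here
        else:
            skipped.append((source, new_target))
    return applied, skipped


# Hypothetical proposals; the operator here skips anything whose source is "D".
proposals = [("A", "C"), ("D", "F")]
applied, skipped = run_supervised(proposals, lambda source, target: source != "D")
print(applied)  # [('A', 'C')]
print(skipped)  # [('D', 'F')]
```

Keeping the skipped pairs in a separate list gives the operator a record of the cases that were deliberately left for manual review.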