accepting user input to quit less? #113

Open
rdauria opened this issue May 18, 2021 · 19 comments

Comments

@rdauria

rdauria commented May 18, 2021

Hello,

I have noticed that when running the bash kernel in a Jupyter notebook opened on a CentOS 7.9.2009 box, the cell hangs indefinitely and does not quit even when attempting to restart the kernel. The behavior is different on a CentOS 6.10 box. Is there a way to give user input to quit less from a bash kernel? Notice that this problem occurs even when running "module av", since the latter relies on less to display content. It seems to me that this issue is perhaps related to #111?

To add context: on the CentOS 7.9.2009 box I have python version 3.7.3, ipykernel==5.3.4 and bash-kernel==0.7.2, while on the CentOS 6.10 box I have python version 3.7.2, ipykernel==5.1.0 and bash-kernel==0.7.2.

Any idea on how to get around this?

Thanks,

RD

@takluyver
Owner

No, there's not really any way round this. It's a longstanding limitation (e.g. #60, #63, #83): the Jupyter kernel protocol expects that the kernel will actively request input it needs while executing a cell. But input in the terminal is pushed to stdin, for a process to read if it's interested. As far as I know, there's no simple, general way to tell that a process is waiting for input.

@multimeric

multimeric commented Nov 1, 2022

Could we have some kind of "magic" command like the ipython kernel has, to indicate that we want to treat a command as interactive? e.g.

%interactive
vim

Then the bash kernel could trigger the Jupyter API that asks for user input?

@multimeric

multimeric commented Nov 2, 2022

I'm willing to contribute this feature.

I had a look at some alternatives. We can't use cell tags or metadata to denote cells as being interactive, because cell metadata isn't sent with execution requests. "Cell magic" aka a %% command seems like our best bet.

Also, since Jupyter already has the stdin channel, and all frontends know how to handle it, I think this is the best way of providing an interface for interactive inputs.

It would be a bit ugly because I don't think the stdin channel is designed for control characters like Ctrl+D (not sure though), but I can't think of anything better that works with the Jupyter protocol.
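As a rough illustration of the cell-magic idea, the kernel could peel a marker off the first line of the cell before handing the rest to bash. This is only a sketch: the magic name `%%interactive` and the helper `split_interactive_magic` are hypothetical and do not exist in bash_kernel.

```python
from typing import Tuple

# Hypothetical magic name; the thread only proposes the idea.
INTERACTIVE_MAGIC = "%%interactive"


def split_interactive_magic(code: str) -> Tuple[bool, str]:
    """Return (is_interactive, remaining_code) for a cell's source text."""
    # Only the first line of the cell is inspected for the magic.
    first, _, rest = code.partition("\n")
    if first.strip() == INTERACTIVE_MAGIC:
        return True, rest
    return False, code
```

A kernel could then decide, per cell, whether to request input over the stdin channel while the command runs.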

@abbbe
Contributor

abbbe commented Nov 2, 2023

I know today is a one year anniversary of the last comment in this thread, but can't find any better place.

It occurred to me that there might be a relatively simple way to add interactivity to bash_kernel by running the bash process inside some terminal multiplexer.

There is tmux (quite feature rich, not sure it will play well with pexpect), more lightweight dtach (can be installed from ubuntu packages), and even more lightweight abduco. I only played with dtach a bit.

For instance, instead of spawning 'bash' we could spawn "dtach -c /tmp/foozle -Ez -r none bash" (here /tmp/foozle would have to be a dynamically generated path, one per live bash_kernel instance). As far as pexpect is concerned, (I think) it will behave pretty much the same as plain bash. But if the user needs to enter a sudo password or quit less, they could just do 'dtach -a /tmp/foozle' and send keyboard commands, then disconnect with "Ctrl-\".

dtach does not keep any history, so when user attaches, they do not see a password prompt of sudo, for instance. I suppose one would have to go for tmux if a history is required.
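A minimal sketch of the "dynamically generated path" part, assuming dtach is installed. The flags are copied verbatim from the command above; the helper name `dtach_command` and the socket file name are made up for illustration.

```python
import os
import tempfile
from typing import List


def dtach_command(shell: str = "bash") -> List[str]:
    """Build the dtach invocation from the comment above with a fresh socket path."""
    # A private temporary directory per kernel instance avoids clashes in /tmp.
    socket_dir = tempfile.mkdtemp(prefix="bash_kernel_")
    socket_path = os.path.join(socket_dir, "dtach.sock")  # hypothetical file name
    # Flags are taken as-is from the proposal in this thread.
    return ["dtach", "-c", socket_path, "-Ez", "-r", "none", shell]
```

The resulting argument list could be handed to whatever spawns the shell, and the socket path (element 2) is what a user would pass to 'dtach -a'.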

@takluyver
Owner

I think that could technically work 🙂 but it's an awkward workaround at best - if you're stuck in something like less, you'd need to open a terminal on the same node, find the socket file the relevant bash_kernel is using, and run some unfamiliar commands.

On the implementation side, I think it would be relatively easy to support something like dtach or abduco, but these are less common in the wild (the cluster at my work has tmux & screen, but not dtach or abduco) and neither seems to be actively maintained. Supporting something like screen/tmux is probably more complicated, because they keep state based on a notion of a fixed-height updateable terminal, which isn't a good fit with Jupyter.

Then bash terminal would either need to automatically use one of these dtach-like programs if it's found, which feels kinda wrong, or it would need some kind of config file, which so far we've managed to avoid. This is all possible, but I'm not convinced the benefits outweigh the added complexity.


The best I can think of just now is to make some kind of wrapper program/function which ran a subcommand in a secondary pty:

make_interactive 'less file.txt'
bash_kernel <---> [pty] <---> make_interactive (run from bash) <---> [pty] <---> less

Then make_interactive would send some sort of marker to indicate to bash_kernel that it was waiting for input (I don't think bash_kernel has anything for this already, so it would have to be added), to present the input prompt in the frontend, and forward any input it got to its child pty.
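To make the inner half of that picture concrete, here is a hedged sketch of a wrapper that runs a command on a secondary pty and asks for a line of input whenever the child goes quiet. Two loud assumptions: "quiet for `idle` seconds" is only a guess that the child wants input, and the `get_input` callback stands in for the marker round-trip to bash_kernel, which does not exist today. The name `run_in_pty` is hypothetical.

```python
import os
import pty
import select
import subprocess
from typing import Callable, List


def run_in_pty(argv: List[str], get_input: Callable[[], str], idle: float = 0.5) -> str:
    """Run argv on a secondary pty; when it goes quiet, feed it a line from get_input()."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave, stdout=slave, stderr=slave)
    os.close(slave)  # the child keeps its own copy of the slave end
    chunks = []
    try:
        while True:
            ready, _, _ = select.select([master], [], [], idle)
            if ready:
                try:
                    data = os.read(master, 4096)
                except OSError:
                    break  # Linux reports EIO once the child side is closed
                if not data:
                    break
                chunks.append(data)
            elif proc.poll() is None:
                # Quiet but still alive: assume (heuristically) it wants input.
                try:
                    os.write(master, get_input().encode() + b"\n")
                except OSError:
                    break
            else:
                break  # child has exited and nothing is left to read
    finally:
        os.close(master)
        proc.wait()
    return b"".join(chunks).decode(errors="replace")
```

For example, `run_in_pty(["sh", "-c", "read x; echo got:$x"], lambda: "hello")` returns text containing the echoed input and the child's reply. A slow but non-interactive command would also trigger the callback, which is one of the reasons the approach is fragile.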

But this is still really janky:

  • Any program that draws 'full screen' in the terminal (like less or vim) isn't going to play well with the notebook user interface.
  • I don't think the Jupyter UI gives you any way to send non-printable keystrokes, like arrow keys, Ctrl-C, Ctrl-D, which you often want for controlling these programs.
  • Even for printable characters, nothing is sent to the kernel until you press enter, so it's q, enter instead of q to quit less.
  • I don't think the kernel can cancel an input request, so you might get an extra input box after you quit the program.
  • It doesn't integrate well with the shell, e.g. using shell variables or piping input/output is going to be awkward at best.

So I'm not enthusiastic about this, and I can't think of any better way to do it 🤷. Interactive terminal programs just don't fit well with Jupyter. @kdm9 is the main maintainer of bash_kernel now, so it's up to him whether a solution along those lines is 'good enough'.

@abbbe
Contributor

abbbe commented Nov 3, 2023

it's an awkward workaround at best - if you're stuck in something like less, you'd need to open a terminal on
the same node, find the socket file the relevant bash_kernel is using, and run some unfamiliar commands.

Personally, I use jupyterlab+bash_kernel to document my pen testing activities. I feel bash_kernel is a real enabler here; pity most people do not know about it. It makes my workflow so much more convenient and self-documenting. But sometimes I stumble across the lack of interactivity, and it is quite a nuisance... In my opinion, even an ugly workaround would be a relief...

Then bash terminal would either need to automatically use one of these dtach-like programs if it's found,
which feels kinda wrong, or it would need some kind of config file, which so far we've managed to avoid.

Do I understand correctly nothing really forces to spawn bash process the moment kernel starts? Could wait to receive first code cell, which could contain %magic config statements. Do you see any problem with this approach?

This is all possible, but I'm not convinced the benefits outweigh the added complexity.

In my humble opinion (supported by #60, #63, #83, #104, ...) benefits do exist. The question is if there is not too janky way to implement it :).

Supporting something like screen/tmux is probably more complicated, because they keep state based on a
notion of a fixed-height updateable terminal, which isn't a good fit with Jupyter.

Yes, but I have a feeling there is a way to make bash_kernel work with tmux smoothly, but can't quite put my finger on it...

So far we assume
bash_kernel -> pexpect -> tmux -> bash
Theoretically, we could do
bash_kernel -> tmux -> bash_kernel_pexpect_helper -> bash

Interactive terminal programs just don't fit well with Jupyter.

Well, I am not convinced this is set in stone... And the output cell could contain JS - which means sky is the limit. And there is the Terminal feature in JupyterLab and even a noVNC lab extension.

@takluyver
Owner

Do I understand correctly nothing really forces to spawn bash process the moment kernel starts? Could wait to receive first code cell, which could contain %magic config statements. Do you see any problem with this approach?

That's possible, just as adding a config file is possible. It's just extra complexity. I suspect the 'special first cell' approach would also confuse some users (why can't I reconfigure this later?).

In my humble opinion (supported by #60, #63, #83, #104, ...) benefits do exist. The question is if there is not too janky way to implement it :).

Oh, there are definitely benefits if a good solution to this is possible. But every idea I've seen so far is really janky - not just to implement, but for the user as well.

Here's yet another idea: I notice the first 3 issues you linked are all prompt-type programs. If we limit the scope to those, i.e. leaving out full-screen programs like vim, we could make a wrapper something like this:

bk-interact -p 'Username:' -p 'Password:' -- git pull

bk-interact -p 'password for \w+:' -- sudo blah

The wrapper would run git/sudo/docker/etc. as a subprocess, look for the specified patterns in its output, and replace them with a special marker to tell bash_kernel to prompt the user for input (which would still need to be added to bash_kernel). That's still janky because you have to think about this in advance and it's different from a normal terminal session, but maybe it's useful enough.

You could even make it guess when input is wanted based on pauses and general patterns (like : at the end of output), so you don't need to specify patterns in common cases. But that gets hacky fast.
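The pattern-matching core of such a wrapper is small. The sketch below only covers the "look for the specified patterns" step; `find_prompt` is a hypothetical name, and the marker that would tell bash_kernel to prompt the user is left out because no such convention exists yet.

```python
import re
from typing import Iterable, Optional


def find_prompt(output: str, patterns: Iterable[str]) -> Optional[str]:
    """Return the prompt text if the output so far ends with one of the patterns."""
    for pattern in patterns:
        # Anchor at the end: a prompt is only "live" if nothing follows it
        # except optional whitespace.
        match = re.search(r"(?:%s)\s*\Z" % pattern, output)
        if match:
            return match.group(0)
    return None
```

With the patterns from the examples above, `find_prompt("[sudo] password for alice: ", [r"password for \w+:"])` returns the matched prompt text, while output that has already moved past the prompt returns None.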

the output cell could contain JS - which means sky is the limit

This is up to @kdm9 now, but if I was still maintaining it, I'd definitely reject any solution that relies on Javascript output. It's not portable to different contexts (like nbconvert), even if you only care about one Jupyter frontend it's more likely to break with frontend changes, and it's an order of magnitude harder to test.

This is not specifically a bash kernel issue - you can do !sudo blah in a notebook with the default IPython kernel and get stuck in exactly the same way. And it does crop up from time to time, e.g.: ipython/ipython#10975, ipython/ipython#10499 and ipython/ipykernel#304. bash_kernel just makes it easier to hit since we're so used to running interactive programs from bash. But any good solution, if it's possible at all, probably needs to involve Jupyter, not just bash_kernel.

@abbbe
Contributor

abbbe commented Nov 6, 2023

I was sure I answered last week, but somehow my long comment was lost it seems... Will keep it short :).

The wrapper would run git/sudo/docker/etc. as a subprocess, look for the specified patterns in its output, and replace them with a special marker to tell bash_kernel to prompt the user for input (which would still need to be added to bash_kernel). That's still janky because you have to think about this in advance and it's different from a normal terminal session, but maybe it's useful enough.

I wonder what you have in mind for "to prompt the user for input (which would still need to be added to bash_kernel)"? Unless you want users to interact with the bash_kernel process directly, it has to go through the JupyterLab UI, which has no support for prompting, if I am not mistaken. Unless you want to consider each prompt as a separate cell or something.

@takluyver
Owner

Jupyter does have support for making an input prompt while a cell runs - to see what I mean, in a Python notebook, do something like a = input('> '). But the kernel has to ask the frontend to make this prompt, and the root of this issue is that bash_kernel doesn't know when something is waiting for a line of input.
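For reference, the exchange behind input('> ') is a pair of messages on the stdin channel: the kernel sends an input_request and the frontend answers with an input_reply. The sketch below only builds and reads the message content as described in the Jupyter messaging documentation; the helper names are made up, and a real kernel would send these through its messaging layer rather than by hand.

```python
from typing import Any, Dict


def input_request_content(prompt: str, password: bool = False) -> Dict[str, Any]:
    """Content of an input_request message (kernel -> frontend, stdin channel)."""
    # password=True asks the frontend not to echo what the user types.
    return {"prompt": prompt, "password": password}


def input_reply_value(reply_content: Dict[str, Any]) -> str:
    """Extract the line the user typed from an input_reply message's content."""
    return reply_content["value"]
```

A frontend only honours such requests when the original execute_request was sent with allow_stdin set to true.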

@kdm9
Collaborator

kdm9 commented Nov 6, 2023

Do I understand correctly nothing really forces to spawn bash process the moment kernel starts? Could wait to receive first code cell, which could contain %magic config statements. Do you see any problem with this approach?

That's possible, just as adding a config file is possible. It's just extra complexity. I suspect the 'special first cell' approach would also confuse some users (why can't I reconfigure this later?).

Agreed, not super happy with this solution

In my humble opinion (supported by #60, #63, #83, #104, ...) benefits do exist. The question is if there is not too janky way to implement it :).

Oh, there are definitely benefits if a good solution to this is possible. But every idea I've seen so far is really janky - not just to implement, but for the user as well.

Again 100% agree.

Here's yet another idea: I notice the first 3 issues you linked are all prompt-type programs. If we limit the scope to those, i.e. leaving out full-screen programs like vim, we could make a wrapper something like this:

bk-interact -p 'Username:' -p 'Password:' -- git pull

bk-interact -p 'password for \w+:' -- sudo blah

The wrapper would run git/sudo/docker/etc. as a subprocess, look for the specified patterns in its output, and replace them with a special marker to tell bash_kernel to prompt the user for input (which would still need to be added to bash_kernel). That's still janky because you have to think about this in advance and it's different from a normal terminal session, but maybe it's useful enough.

You could even make it guess when input is wanted based on pauses and general patterns (like : at the end of output), so you don't need to specify patterns in common cases. But that gets hacky fast.

Love it, would gladly accept a PR for such code.

the output cell could contain JS - which means sky is the limit

This is up to @kdm9 now, but if I was still maintaining it, I'd definitely reject any solution that relies on Javascript output. It's not portable to different contexts (like nbconvert), even if you only care about one Jupyter frontend it's more likely to break with frontend changes, and it's an order of magnitude harder to test.

Absolutely agree, this should use the "standard" Jupyter-supported prompt infrastructure, like input("> ") in a python notebook would.

This is not specifically a bash kernel issue - you can do !sudo blah in a notebook with the default IPython kernel and get stuck in exactly the same way. And it does crop up from time to time, e.g.: ipython/ipython#10975, ipython/ipython#10499 and ipython/ipykernel#304. bash_kernel just makes it easier to hit since we're so used to running interactive programs from bash. But any good solution, if it's possible at all, probably needs to involve Jupyter, not just bash_kernel.

Sorry but I'm super time limited until the end of this year, so it's unlikely I'll be able to implement this in a timely fashion. If someone does want to do the leg work, i'd be more than happy to review. If not, remind me in the new year and I'll implement @takluyver's suggestion above (an 'expect' like wrapper).

@abbbe
Contributor

abbbe commented Nov 9, 2023

Jupyter does have support for making an input prompt while a cell runs - to see what I mean, in a Python notebook, do something like a = input('> '). But the kernel has to ask the frontend to make this prompt, and the root of this issue is that bash_kernel doesn't know when something is waiting for a line of input.

I see. Nice to know :)

@abbbe
Contributor

abbbe commented Nov 9, 2023

As far as I know, there's no simple, general way to tell that a process is waiting for input.

I think there is one; shells use it. Try "(sleep 3; cat) &" and press Enter a few times, about 1s apart:

% (sleep 3; cat) &
[1] 20621
% 
% 
% 
[1]  + suspended (tty input)  ( sleep 3; cat; )
% 

It relies on SIGTTIN & SIGTTOU, standard POSIX signals.

@abbbe
Contributor

abbbe commented Nov 10, 2023

Done a bit more research :). Found an interesting alternative to tmux/screen, it is called termpair, purely python. It can be wrapped around bash (like tmux/screen we've discussed, but absolutely seamlessly). Features web-based terminal emulator to interact with programs without having to engineer stuff like "bk-interact -p 'Username:' -p 'Password:' -- git pull" beforehand.

I have done a quick test, which can be reproduced as follows:

Then in JupyterLab UI:

  • open a terminal and launch 'termpair serve' - this will spawn a listener process which will serve web terminal session,
  • launch bash kernel - this will spawn bash as usual, but through script and termpair. (here we need script because termpair will display URL to stdout on startup, and this output is swallowed by bash_kernel, can probably work around this)
  • open another terminal and get URL from /tmp/blah (hardcoded output of script, see above), using "grep Shareable /tmp/blah" for instance
  • open that URL in a browser, click on the terminal but do not type anything
  • go back to UI and send 'sudo id' (assuming password required)
  • go to the browser and see sudo session rendered normally, enter password

This is dirty and fragile (the script part especially is a mess), but the interesting part is that it allows interactive programs to run under Jupyter without any compromises. By the way, script's parameters are tuned to the macOS syntax for unbuffered mode, '-t 0'; on Linux this will probably need to change to '-f'.

@kdm9
Collaborator

kdm9 commented Nov 10, 2023

Impressive, but without meaning any offence, this definitely falls in the category of

" But every idea I've seen so far is really janky - not just to implement, but for the user as well."

:)

@abbbe
Contributor

abbbe commented Nov 10, 2023

very difficult to argue with this, but I will try haha)

Sure, if avoiding HTML is your design choice, the best you can do is to rely on input() with a "bk-interact" helper, or maybe take advantage of the POSIX SIGTTIN/SIGTTOU signals.

Personally, I fail to see why HTML/JS is a no-go. Yes, it's not gonna work in 'jupyter console', but I think this is a fair limitation; ipywidgets are not supported there either. The only explanation I can see is you see no real value in full-blown support of interactive programs, which surely can be a 100% valid design choice - your church, your rules ;).

But if we lift this restriction I think a very user-friendly interface is possible. The moment a program requires input a hyperlink or a button to open web terminal could appear in cell output area. User presses it, interacts with the program, closes it. Cell output retains everything that has happened. Business as usual + interactivity. Implementation-wise this adds another websocket and listener which serves it, obviously a complication, but perhaps it is not too awkward addition to kernel gateway piece.

Myself, I keep pushing for this because (1) this is a technically challenging issue which is fun to work on haha and (2) I am using notebooks to document my activities during pentests, which require a variety of tools including interactive ones. I will tell you something even more blasphemous... some tools I rely on have a GUI, and I believe there are no technical limitations preventing GUI programs from being launched from bash_kernel, with a noVNC console available in the cell output and screenshots retained in the cell output.

Hope my ramblings do not fall into "spam" category. I appreciate your feedback :).

@takluyver
Owner

It relies on SIGTTIN & SIGTTOU, standard POSIX signals.

Interesting idea! I don't think it's easy to do something based on that, because SIGTTIN is sent to background processes when they try to read from the terminal, and the processes we're talking about here are in the foreground. Maybe it's possible to do something where we put all the processes we're interested in into the background and then watch to see if they get stopped, then flip them into the foreground again to read their input...

I'd give it about a 60% chance that it's possible to make something like that based on SIGTTIN that can work for a simple demo, but only about a 10% chance that it's possible to make something robust enough to be broadly useful.

Personally, I fail to see why HTML/JS is no-go.

I think at this point I'd say it's a question of different goals. My goal with bash_kernel, which I think @kdm9 shares, is/was to make bash_kernel a 'good citizen' within the Jupyter kernel interface, not to morph the notebook UI into a full-blown terminal emulator or a remote desktop client. I also don't think either of us have the time to maintain that additional complexity.

So, if that is your goal, you might want to fork bash_kernel, come up with a cool name, and use it as the starting point for your own experiments. I would have no resentment about this - part of what's great about open source is that you can take someone's code to explore a different idea. 😃

The only explanation I can see is you see no real value...

I think this kind of absolute-ish language is an obstacle to a constructive discussion, while the rest of your post was quite nice. I do see value in your goals! I've tried to make that clear. I just also see significant drawbacks, and I think for this project, they outweigh the benefits.

@abbbe
Contributor

abbbe commented Nov 13, 2023

I'd give it about a 60% chance that it's possible to make something like that based on SIGTTIN that can work for a simple demo, but only about a 10% chance that it's possible to make something robust enough to be broadly useful.

I think SIGTTIN has been in use by *nix shells since shell background processes were invented; the thing is as robust as it gets and is designed exactly for the purpose of letting a parent process know its child wants to use the terminal. However, considering I am not ready to produce PoC code to substantiate my opinion, this is merely a gut feeling :).

I think this kind of absolute-ish language is an obstacle to a constructive discussion

You are right, I should have expressed myself differently. Sincere apologies if I have hurt anyone's feelings.

My goal with bash_kernel, which I think @kdm9 shares, is/was to make bash_kernel a 'good citizen' within the Jupyter kernel interface, not to morph the notebook UI into a full-blown terminal emulator or a remote desktop client.

Sure, makes sense. bash_kernel in its current state is extremely useful already.

So, if that is your goal, you might want to fork bash_kernel, come up with a cool name, and use it as the starting point for your own experiments. I would have no resentment about this - part of what's great about open source is that you can take someone's code to explore a different idea.

Yes, I am considering this :).

@takluyver
Owner

I think SIGTTIN has been in use by *nix shells since shell background processes were invented; the thing is as robust as it gets and is designed exactly for the purpose of letting a parent process know its child wants to use the terminal.

I think it probably works well for background processes in a terminal emulator, but what we've got are foreground processes that aren't really in a terminal emulator. So we'd need to move everything we run into the background (including bash itself? or just its children? 🤔 ). The process trying to read gets SIGTTIN, not us, so we'd have to watch them to see when they're stopped, ask the frontend to prompt for some input, then switch the program into the foreground, write the input to the pty, and put the program back into the background to see if it wants more input. Race condition alert! If the program tries to read before we background it, does it still get SIGTTIN? 🤷 Conversely, could we background it too quickly and cause it to miss some of its output?

Does this behave the way we want for programs like vim/less that can ~always accept input? Will they write a new screen as output before attempting to read more input? What about programs that set a non-default handler for SIGTTIN? Does anything actually do that?

You definitely get kudos for thinking of SIGTTIN - I've been answering variants of this question for years and haven't spotted that SIGTTIN might help. And it could be a pretty neat solution if the answers to all those things above are the ones we want, and if there are no other issues. I just expect it won't come out quite that neatly; perhaps I'm cynical. 😉

You are right, I should have expressed myself differently. Sincere apologies if I have hurt anyone's feelings.

Thanks, no harm done 🙂

@abbbe
Contributor

abbbe commented Nov 13, 2023

It has been a long time since I studied this stuff and I don't have time now for proper research, unfortunately. But, for what it's worth:

The process trying to read gets SIGTTIN, not us ...

Yes. But a parent can arrange to receive SIGCHLD or monitor a child via waitpid(). One thing is bash_kernel is not a parent, but a grand-parent. Perhaps can be worked around using shell signal trap (yes things start to get a bit dirty already).
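The waitpid() half of this can be sketched in a few lines. With the WUNTRACED option a parent is told when a child stops, and which signal stopped it. Assumptions to note: the demo child stops itself with SIGSTOP as a stand-in, because a real SIGTTIN stop needs a controlling terminal and job control; the names `wait_for_stop` and `demo` are made up; and the grand-parent problem mentioned above is not addressed.

```python
import os
import signal
from typing import Optional


def wait_for_stop(pid: int) -> Optional[int]:
    """Block until child `pid` stops or exits; return the stop signal, or None on exit."""
    _, status = os.waitpid(pid, os.WUNTRACED)
    if os.WIFSTOPPED(status):
        # For a background read from the terminal this would be SIGTTIN.
        return os.WSTOPSIG(status)
    return None


def demo() -> Optional[int]:
    """Fork a child that stops itself, observe the stop, then resume and reap it."""
    pid = os.fork()
    if pid == 0:
        # Child: SIGSTOP stands in for the SIGTTIN a real job would receive.
        os.kill(os.getpid(), signal.SIGSTOP)
        os._exit(0)
    stop_signal = wait_for_stop(pid)
    os.kill(pid, signal.SIGCONT)  # let the child continue ...
    os.waitpid(pid, 0)            # ... and collect its exit status
    return stop_signal
```

In the real scenario the interesting value would be signal.SIGTTIN, at which point the kernel could ask the frontend for input before resuming the job.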

If the program tries to read before then we background it, does it still get SIGTTIN? 🤷 Conversely, could we background it too quickly and cause it to miss some of its output?

I think the moment the child gets SIGTTIN it gets paused; maybe this can be leveraged somehow. From what I remember, POSIX stuff is extremely well designed; if I had to gamble, I would put my money on the assumption that this situation can be handled without race conditions ;). Of course there are two foreign layers (pexpect + bash) in between bash_kernel and a hypothetical sudo - this might easily ruin all the beauty of POSIX, but you never know ;).

Does this behave the way we want for programs like vim/less that can ~always accept input?

I would say lack of proper support for vim/less should not be a show stopper. You have said it yourself, turning bash_kernel into a full-blown terminal emulator is not necessarily a good idea ;).

If you ask me -- it would be really user-friendly to at least have some visual indication that cell execution is blocked because shell command wants some user input. The next level is handling this situation with input() as you have envisioned. Yet another level is support of sudo to avoid password disclosure (it is probably possible to tell if a pty is in non-echo mode and call getpass() instead of input()). Programs using libreadline for prompting is yet another level of complexity I suspect...
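The non-echo check is the easiest piece to try. A program like sudo turns off the ECHO flag on its terminal before reading a password, and that flag can be read back with termios. The sketch below assumes a Linux-style pty where the settings are visible through the master side; `echo_enabled` and `set_echo` are illustrative names, not bash_kernel functions.

```python
import termios


def echo_enabled(fd: int) -> bool:
    """Return True if the terminal behind fd currently echoes typed input."""
    lflag = termios.tcgetattr(fd)[3]  # index 3 holds the local-mode flags
    return bool(lflag & termios.ECHO)


def set_echo(fd: int, enabled: bool) -> None:
    """Turn terminal echo on or off, as a password prompt would."""
    attrs = termios.tcgetattr(fd)
    if enabled:
        attrs[3] |= termios.ECHO
    else:
        attrs[3] &= ~termios.ECHO
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
```

If `echo_enabled` returns False while a command is waiting, the kernel could request input with the password flag set instead of a plain prompt. pexpect's spawn objects offer a similar check via getecho().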

PS. I suspect one has to wrap their head around this https://www.gnu.org/software/libc/manual/html_node/Job-Control.html to have a chance to solve this riddle ;)
